ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[serve] Requests should be sent only among replicas of the same application version #36216

Open zcin opened 1 year ago

zcin commented 1 year ago

What happened + What you expected to happen

Currently, deployment handles route traffic to all replicas of a deployment regardless of version. This breaks the semantics of application upgrades: users should be able to safely change the interfaces of deployments within an application.

We should enforce that traffic within an application is routed only between replicas of the same application version. That is, a replica should only ever send requests to replicas running the same application version as itself.
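To make the proposed rule concrete, here is a minimal, standalone sketch of version-filtered replica selection. It does not use Ray Serve's actual router or handle classes; `ReplicaInfo`, `VersionAwareRouter`, and the `app_version` field are hypothetical names chosen for illustration, and random choice stands in for whatever load-balancing policy the real router uses.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ReplicaInfo:
    # Hypothetical record; not Ray Serve's actual replica metadata.
    replica_id: str
    app_version: str


class VersionAwareRouter:
    """Toy router that only picks replicas matching the caller's app version."""

    def __init__(self, replicas: List[ReplicaInfo], seed: Optional[int] = None):
        self._replicas = list(replicas)
        self._rng = random.Random(seed)

    def candidates(self, caller_version: str) -> List[ReplicaInfo]:
        # Filter out every replica running a different application version.
        return [r for r in self._replicas if r.app_version == caller_version]

    def choose(self, caller_version: str) -> ReplicaInfo:
        matches = self.candidates(caller_version)
        if not matches:
            # Fail loudly rather than fall back to a mismatched version.
            raise LookupError(f"no replica running app version {caller_version!r}")
        return self._rng.choice(matches)


# Mid-upgrade state: two replicas on the old version, one on the new one.
replicas = [
    ReplicaInfo("r1", "v1"),
    ReplicaInfo("r2", "v1"),
    ReplicaInfo("r3", "v2"),
]
router = VersionAwareRouter(replicas, seed=0)
```

In this sketch a caller on `v2` can only ever reach `r3`, and a caller whose version has no live replicas gets an error instead of being routed across versions. Whether the real behavior should be an error, queueing, or something else is a design decision this issue leaves open.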

One thing to be aware of: after the controller crashes, it will not know during the recovery period which code version is running on each live replica actor.
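One conservative way to handle that recovery window, which this issue does not prescribe, is to treat a replica's version as unknown until the replica reports it, and to exclude unknown-version replicas from routing in the meantime. The sketch below is plain Python rather than Ray Serve code; `RecoveringVersionTable` and the `query_version` callback are hypothetical names.

```python
from typing import Callable, Dict, Iterable, List, Optional


class RecoveringVersionTable:
    """Tracks replica versions, with None meaning 'not yet known'."""

    def __init__(self) -> None:
        self._versions: Dict[str, Optional[str]] = {}

    def register_recovered(self, replica_ids: Iterable[str]) -> None:
        # After a controller restart, live replicas are rediscovered
        # without any version information attached.
        for rid in replica_ids:
            self._versions.setdefault(rid, None)

    def refresh(self, query_version: Callable[[str], Optional[str]]) -> None:
        # Ask each unknown replica for its version (hypothetical callback);
        # a None result means the replica has not answered yet.
        for rid, version in list(self._versions.items()):
            if version is None:
                self._versions[rid] = query_version(rid)

    def routable(self, caller_version: str) -> List[str]:
        # Unknown-version replicas never match, so they receive no traffic.
        return sorted(
            rid for rid, v in self._versions.items() if v == caller_version
        )


table = RecoveringVersionTable()
table.register_recovered(["a", "b", "c"])
before = table.routable("v1")   # nothing is routable right after recovery

reported = {"a": "v1", "b": "v2"}  # replica "c" has not responded yet
table.refresh(reported.get)
after = table.routable("v1")
```

The trade-off is availability: until replicas report back, requests have nowhere to go, so a real implementation would need to decide whether to queue, retry, or reject traffic during that window.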

Versions / Dependencies

ray 2.5

Reproduction script

n/a

Issue Severity

None

anyscalesam commented 4 months ago

@zcin is this fixed?