ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[workflow] create a virtual actor from a virtual actor #20515

Closed fishbone closed 2 years ago

fishbone commented 2 years ago

Once a virtual actor has been created, the user sometimes wants to duplicate it, like a fork. One way to do this is to get the content of the virtual actor and create another virtual actor from it. But this approach has a limitation: sometimes the user doesn't have the code.

Another, more efficient way to do this is to copy from storage. I propose the following API:

@PublicAPI(stability="beta")
def get_actor(actor_id: str, create_from: Optional[str] = None) -> "VirtualActor":
    ...

workflow.get_actor(actor_id="xxx", from_actor_id="yyy")

This will create virtual actor xxx from virtual actor yyy. If xxx has already been created, it'll just be returned.

fishbone commented 2 years ago

@ericl could you share some comments about the above API proposal?

ericl commented 2 years ago

Hmm, I feel it would be more general if we added get_state / set_state methods on actors. So you could write:

actor2.set_state(actor1.get_state()). This could be useful for debugging or manually resetting the state of the actors anyways.
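As a rough illustration of the get_state / set_state idea, here is a minimal sketch on an ordinary Python class. The `Counter` class and the deep-copy behavior are assumptions made for the example; this is not a Ray virtual actor and not an existing Ray API.

```python
import copy
from typing import Any, Dict


class Counter:
    """Toy actor-like class (hypothetical; not a Ray virtual actor)."""

    def __init__(self, value: int = 0) -> None:
        self.value = value

    def incr(self) -> int:
        self.value += 1
        return self.value

    def get_state(self) -> Dict[str, Any]:
        # Return a copy so callers cannot mutate this object's state in place.
        return copy.deepcopy(self.__dict__)

    def set_state(self, state: Dict[str, Any]) -> None:
        # Copy again so the two objects never share mutable state.
        self.__dict__.update(copy.deepcopy(state))


actor1 = Counter(5)
actor2 = Counter()
actor2.set_state(actor1.get_state())
actor2.incr()  # Advances actor2 only; actor1 is unchanged.
```

Note that this pattern requires a handle to both objects, which in turn requires the class definition.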

yuanchi2807 commented 2 years ago

Confirming actor_id respects namespace?

klwuibm commented 2 years ago

@iycheng In the following code segment, would the estimator inside new_va be a different object from the estimator inside old_va? Or new_va and old_va might still share the same estimator object?

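import ray
from ray import workflow
from sklearn.base import BaseEstimator
from sklearn.linear_model import LogisticRegression


@ray.workflow.virtual_actor
class MLNode:
    def __init__(self, estimator: BaseEstimator):
        if estimator is not None:
            self.estimator = estimator

    @workflow.virtual_actor.readonly
    def get_model(self):
        return self.estimator

    # The estimator is the only state that gets persisted.
    def __getstate__(self):
        return self.estimator

    def __setstate__(self, estimator):
        self.estimator = estimator

logistic = LogisticRegression(max_iter=10000, tol=0.1)
old_va = MLNode.get_or_create('old_va_name', logistic)
# Proposed (not implemented) API: create 'new_va_name' from 'old_va_name'.
new_va = workflow.get_or_create('new_va_name', 'old_va_name')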

fishbone commented 2 years ago

> Hmm, I feel it would be more general if we added get_state / set_state methods on actors. So you could write:
>
> actor2.set_state(actor1.get_state()). This could be useful for debugging or manually resetting the state of the actors anyways.

@ericl I think the problem is that you may not have the Actor class, so you can't call actor2.set_state(actor1.get_state()). Btw, the reason I put it this way is that the code could be very simple and efficient to implement: just copy the underlying data.

fishbone commented 2 years ago

> Confirming actor_id respects namespace?

@yuanchi2807 Right now we don't have a namespace concept, so there is no namespace here. But when we add it later, this API should get an extra field: from_namespace. It'd be nice if @lchu-ibm could help with this one: https://github.com/ray-project/ray/issues/18818

fishbone commented 2 years ago

> @iycheng In the following code segment, would the estimator inside new_va be a different object from the estimator inside old_va? Or new_va and old_va might still share the same estimator object?
>
> @ray.workflow.virtual_actor
> class MLNode():
>     def __init__(self, estimator: BaseEstimator):
>         if estimator is not None:
>             self.estimator = estimator
>
>     @workflow.virtual_actor.readonly
>     def get_model(self):
>         return self.estimator
>
>     def __getstate__(self):
>         return self.estimator
>
>     def __setstate__(self, estimator):
>         self.estimator = estimator
>
> logistic = LogisticRegression(max_iter=10000, tol=0.1)
> old_va = MLNode.get_or_create('old_va_name', logistic)
> new_va = workflow.get_or_create('new_va_name', 'old_va_name')

They will be different, even in the storage layer.

You should treat them as different virtual actors.
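To see why a copy made through storage yields independent objects, consider this sketch. It assumes the persisted state is serialized (here with `pickle`) and deserialized into the new actor; `MLNodeState` is a hypothetical plain-Python stand-in for MLNode, and a dict stands in for the scikit-learn estimator.

```python
import pickle


class MLNodeState:
    """Plain-Python stand-in for the MLNode class in the question (hypothetical)."""

    def __init__(self, estimator):
        self.estimator = estimator

    def __getstate__(self):
        return self.estimator

    def __setstate__(self, estimator):
        self.estimator = estimator


old = MLNodeState({"max_iter": 10000, "tol": 0.1})

# Serialize the old actor's state, as a write to storage would.
blob = pickle.dumps(old.__getstate__())

# Build the new actor from the stored bytes, without calling __init__.
new = MLNodeState.__new__(MLNodeState)
new.__setstate__(pickle.loads(blob))

# Mutating the copy leaves the original untouched.
new.estimator["tol"] = 0.5
```

Since the state passes through bytes, `new.estimator` cannot be the same object as `old.estimator`.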

fishbone commented 2 years ago

Btw, I updated the API to use workflow.get_actor. Please check the description for the update.

lchu-ibm commented 2 years ago

@iycheng sure, I can look into the namespace feature.

fishbone commented 2 years ago

Closing, given that we are not supporting virtual actors now.