ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[Core] using `@ray.remote` on an extended class with __init_subclass__ causes the metaclass to initialize multiple times #25491

Closed anadahalli closed 2 years ago

anadahalli commented 2 years ago

What happened + What you expected to happen

@ray.remote on an extended class with __init_subclass__ causes the metaclass to initialize multiple times.

The same can also be observed when creating an actor from the actor class.

Versions / Dependencies

Ray version == 1.12.1

Reproduction script

import ray

ray.__version__  # 1.12.1

class Base:
    def __init_subclass__(cls, *args, **kwargs):
        print("init_subclass")

@ray.remote
class Actor(Base):
    def __init__(self):
        print("init")

""" stdout
init_subclass 
init_subclass
init_subclass 
"""

a = Actor.remote()

""" stdout
(pid=595687) init_subclass
(pid=595687) init_subclass
(Actor pid=595687) init
"""

Issue Severity

Medium: It is a significant difficulty but I can work around it.

tupui commented 2 years ago

Hi @anadahalli, thank you for reporting. This is due to a few things. When an actor is created:

  1. The class is initialized.
  2. DerivedActorClass is created. The actor class we are constructing inherits from the original class, so that it retains all class properties.

I am not sure where the third call is coming from, though. EDIT: actually I know. It happens when the Actor class itself is defined and is not linked to Ray. If you remove the decorator, you will still see the print even though you are not calling Actor.__init__.
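To illustrate the mechanism without Ray, here is a minimal plain-Python sketch. The names `Modified` and `Derived` are hypothetical stand-ins for wrapper classes that a decorator might derive; they are not Ray's actual internals. The point is only that every additional level of subclassing fires `__init_subclass__` again.

```python
# Records the name of every class that triggers the hook.
calls = []


class Base:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        calls.append(cls.__name__)


# Defining the class fires the hook once, with no decorator involved.
class Actor(Base):
    pass


# Hypothetical wrappers: each one subclasses the previous class,
# so each one fires the hook again.
class Modified(Actor):
    pass


class Derived(Modified):
    pass


print(calls)  # ['Actor', 'Modified', 'Derived']
```

Three class definitions produce three calls, which matches the three `init_subclass` lines in the report.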

You could use __init__ instead of __init_subclass__, and then there would be no repeated execution. Is there a reason why you are using __init_subclass__? Do you really need metaclasses?
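If the hook has to stay, one possible workaround is to make it idempotent with a marker attribute. This is a sketch under assumptions: `_hook_done` is a made-up attribute name, the guard only works within a single Python process, and it has not been verified against Ray, where the class may be rebuilt in the worker process. It also skips intentional subclasses of an already initialized class, which may or may not be what you want.

```python
# Names of classes for which the setup logic actually ran.
initialized = []


class Base:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Skip any class that derives from one the hook already handled.
        if any("_hook_done" in vars(parent) for parent in cls.__mro__[1:]):
            return
        cls._hook_done = True  # hypothetical marker attribute
        initialized.append(cls.__name__)


class Actor(Base):
    pass


# Stand-in for a wrapper class derived by a decorator: the hook is skipped.
class Wrapper(Actor):
    pass


# An unrelated direct subclass still runs the setup logic.
class Another(Base):
    pass


print(initialized)  # ['Actor', 'Another']
```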

anadahalli commented 2 years ago

@tupui, thank you for helping me understand this better.

I'm trying to create a microservices framework using Ray, similar to nameko, where both services and extensions are implemented as Ray actors.

My current approach to injecting the providers and registering the entrypoints of all the extensions of a service is based on the metaclass, as this feels more pythonic to write.


# framework
class Service:
    def __init_subclass__(cls, *args, **kwargs):
        # collect all extensions and create support actors for them
        # inject extension providers (actor handle) to service class
        # register entrypoint methods with the extension actor
        # register on_init, atexit, health, metrics, tracers, etc
        # register the created actor with central registry
        ...

class Extension:
    def __init_subclass__(cls):
        ...
    def start(self):
        ...
    def stop(self):
        ...

# developer
class Scheduler(Extension):
    @classmethod
    def schedule(cls, **kwargs):
        ...

class Pubsub(Extension):
    @classmethod
    def subscribe(cls, topic, callback):
        ...

class MyService(Service):
    # injected during runtime
    config = Config()

    # registers the actor method with the extension (dependency actor)
    @schedule("interval", seconds=10)
    def task(self):
        ...

    @subscribe("topic")
    def callback(self, msg):
        ...

service = ray.remote(MyService).remote()

Can you suggest a better approach to do this in ray?

Thanks.

mattip commented 2 years ago

Metaclasses are one way to condense more verbose code into more concise code by using metaprogramming. I would start with writing the more verbose code first, and then see how to proceed.

Start with your higher-level APIs, get your tests in place to exercise them, implement as simply as possible, and only then start to refactor things to use more advanced techniques.
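As a rough idea of what the more verbose version could look like, the sketch below replaces the `__init_subclass__` hook with an explicit registration call. Everything here is an assumption made for illustration: `entrypoint`, `register_service` and `REGISTRY` are hypothetical names, not part of Ray or nameko. Ray is left out so the snippet runs on its own; the comment marks where `ray.remote` would come in.

```python
# Hypothetical central registry: service name -> entrypoint metadata.
REGISTRY = {}


def entrypoint(kind, **options):
    """Mark a method as an entrypoint by storing metadata on the function."""
    def mark(func):
        func._entrypoint = (kind, options)
        return func
    return mark


def register_service(cls):
    """Explicit setup step. It runs only when called, never on subclassing."""
    REGISTRY[cls.__name__] = {
        name: attr._entrypoint
        for name, attr in vars(cls).items()
        if callable(attr) and hasattr(attr, "_entrypoint")
    }
    return cls


class MyService:
    @entrypoint("interval", seconds=10)
    def task(self):
        ...

    @entrypoint("subscribe", topic="topic")
    def callback(self, msg):
        ...


register_service(MyService)  # called once, before wrapping with ray.remote


# A wrapper that subclasses MyService does not repeat the registration.
class Wrapped(MyService):
    pass


print(REGISTRY)
```

Because registration is an ordinary function call, it happens exactly once no matter how many classes are later derived from `MyService`.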