cloudpipe / cloudpickle

Extended pickling support for Python objects
Other
1.64k stars 167 forks source link

Cannot exclude members to pickle in metaclass #482

Open elbakramer opened 2 years ago

elbakramer commented 2 years ago

In the example below, I implemented __getstate__(cls) that returns empty dict in order not to pickle cls.INSTANCES and cls.LOCK members of metaclass SingletonMeta.

But it seems that cloudpickle still tries to pickle those members and fails to pickle since threading.RLock is not picklable.

import cloudpickle

from threading import RLock

class SingletonMeta(type):
    def __new__(cls, name, bases, namespace):
        return super().__new__(cls, name, bases, namespace)

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls.INSTANCES = {}
        cls.LOCK = RLock()

    def __call__(cls, *args, **kwargs):
        key = cls.__getkey__(*args, **kwargs)
        key = tuple(key.items())
        if key not in cls.INSTANCES:
            with cls.LOCK:
                if key not in cls.INSTANCES:
                    cls.INSTANCES[key] = super().__call__(*args, **kwargs)
        instance = cls.INSTANCES[key]
        return instance

    def __getstate__(cls):
        # don't want to pickle cls.INSTANCES and cls.LOCK
        return {}

    def __setstate__(cls, state):
        pass

class SingletonClass(metaclass=SingletonMeta):
    def __init__(self, key):
        self.key = key
        self.transient = [self.key]

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["transient"]
        return state

    def __setstate__(self, state):
        for name, value in state.items():
            setattr(self, name, value)
        self.transient = [self.key]

    @classmethod
    def __getkey__(cls, key):
        return {"key": key}

# singleton implementation works fine as long as metaclass = SingletonMeta
def test_singleton():
    singleton_instance1 = SingletonClass(0)
    singleton_instance2 = SingletonClass(0)
    singleton_instance3 = SingletonClass(1)
    assert singleton_instance1 is singleton_instance2
    assert singleton_instance1 is not singleton_instance3

# pickling instances works fine if metaclass != SingletonMeta
# but raises `TypeError: cannot pickle '_thread.RLock' object` if metaclass = SingletonMeta
def test_singleton_pickle():
    singleton_instance1 = SingletonClass(0)
    singleton_instance1_loaded = cloudpickle.loads(
        cloudpickle.dumps(singleton_instance1)
    )
    assert singleton_instance1 is not singleton_instance1_loaded
    assert singleton_instance1.key == singleton_instance1_loaded.key
    assert singleton_instance1.transient == singleton_instance1_loaded.transient

if __name__ == "__main__":
    # singleton implementation works fine as long as metaclass = SingletonMeta
    test_singleton()

    # pickling instances works fine if metaclass != SingletonMeta
    # but raises `TypeError: cannot pickle '_thread.RLock' object` if metaclass = SingletonMeta
    test_singleton_pickle()

    # raises "TypeError: cannot pickle '_thread.RLock' object"
    cloudpickle.dumps(test_singleton)
pierreglaser commented 1 year ago

Hi, thanks for the report. The issue lies in the fact that cloudpickle does not override pickles logic of treating classes deriving from custom metaclasses, not as instances of a metaclass, but as traditional classes, for which __getstate__ is not part of any reducing spec (since there is no official spec on reducing dynamically created classes, only an unofficial cloudpickle implementation).

To fix your case, I suspect (my reasoning comes from pickle's save logic, on which cloudpickle relies) you need a custom reducer entry within copyreg.dispatch_table for SingletonMeta instances which can piggy-back on cloudpickle's reducer for dynamically created classes, _dynamic_class_reduce. Only problem is that this function is in theory private, so it might be removed without notice.

pierreglaser commented 1 year ago

@elbakramer feel free to try the workaround I proposed. I'm likely to close this soon since unless I'm mistaken, an alternative seems possible.