llllllllll / cloudpickle-generators

cloudpickle support for generators, including partially consumed generators.
BSD 3-Clause "New" or "Revised" License
17 stars 6 forks source link

When trying to load generator with send and some closing state can't set attributes of built-in/extension type 'cloudpickle_generators._core.unset_value_type #6

Open merc1031 opened 5 years ago

merc1031 commented 5 years ago

Notes: Running python3.6 on ubuntu

I am trying to use your project to make this library (https://github.com/JadenGeller/Guac) work on cpython. In particular i am changing line 40 in this file (https://github.com/JadenGeller/Guac/blob/master/guac/monad.py) to cloudpickle.loads(cloudpickle.dumps(continuation)) after registering cloudpickle_generators at the top of the file.

then i simply copied Guac's make_change example into example.py and tried to run it and i get the following error.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/code/guac/monad.py", line 109, in monadic_context
    return monad._run(f(*args, **kwargs))
  File "/code/guac/monad.py", line 56, in _run
    return step(next(computation), computation)
  File "/code/guac/monad.py", line 54, in step
    return self.bind(monadic_value, proceed)
  File "/code/guac/instances.py", line 32, in bind
    result += f(elem)
  File "/code/guac/monad.py", line 49, in proceed
    return step(invocation.send(x), invocation)
  File "/code/guac/monad.py", line 54, in step
    return self.bind(monadic_value, proceed)
  File "/code/guac/instances.py", line 32, in bind
    result += f(elem)
  File "/code/guac/monad.py", line 47, in proceed
    invocation = cloudpickle.loads(d) # Requires Pypy!
  File "/code/src/cloudpickle/cloudpickle/cloudpickle.py", line 1250, in _rehydrate_skeleton_class
    setattr(skeleton_class, attrname, attr)
TypeError: can't set attributes of built-in/extension type 'cloudpickle_generators._core.unset_value_type'

I have an example repo at https://github.com/merc1031/Guac/tree/test_cloudpickle

To repro you should be able to simple pip install -r requirements.txt followed by

python3
>>> import example
llllllllll commented 5 years ago

Thank you for writing such a wonderful bug report with a simple repro. I will look into this right now.

Side-note: I wonder if I should try to upstream at least copy.deepcopy for generators to CPython. There is going to be a lot of overhead involved with using cloudpickle to deep copy in process because it is designed for inter-process communication.

merc1031 commented 5 years ago

As far as ive seen the python devs are not really interested in deepcopy or pickle for generators (in the few forum posts ive seen). Obviously I'd be all for first party support.

For the time being if cloudpickle can do the copy it would be pretty neat for this experiment.

On a side note: Can you think of any other/more efficient way to do the pickling/copying of this kind of generator (in my example)? The code in the original Guac repo is for pypy (which can copy generators) not cpython, but I have no idea how efficent it may be even over there.

llllllllll commented 5 years ago

with https://github.com/llllllllll/cloudpickle-generators/pull/7 I was able to run example.py from your repo, can you confirm this works for you also?

llllllllll commented 5 years ago

As far as the deepcopy thing goes, would you expect that the f_globals get copied and all of the global variables also get copied? If not, I have a prototype that is much faster than a cloudpickle roundtrip.

merc1031 commented 5 years ago

7 seems to work for the example.

I wont be able to try it against more robust testing till tomorrow or Monday, but it seems to hold up for now.

I'd be interested in seeing the prototype code as well. What I'm trying to simulate is something akin to Haskells Monads, so really they shouldn't be allowed to access truly global things.

llllllllll commented 5 years ago
diff --git a/cloudpickle_generators/__init__.py b/cloudpickle_generators/__init__.py
index 8653ab2..2726ebe 100644
--- a/cloudpickle_generators/__init__.py
+++ b/cloudpickle_generators/__init__.py
@@ -1,3 +1,4 @@
+import copy
 from itertools import chain
 import pickle
 from types import FunctionType, GeneratorType
@@ -193,3 +194,45 @@ def unregister():
         # make sure we are only removing the dispatch we added, not someone
         # else's
         del CloudPickler.dispatch[GeneratorType]
+
+
+def deepcopy(ob, memo=None):
+    if not isinstance(ob, GeneratorType):
+        return copy.deepcopy(ob, memo=memo)
+
+    gen = ob
+    frame = gen.gi_frame
+    if frame is None:
+        # frame is None when the generator is fully consumed; take a fast path
+        return gen
+
+    f_code = frame.f_code
+
+    # Create a copy of generator function without the closure to serve as a box
+    # to store in the memo dict while we copy the locals. This defends against
+    # cycles where the instance appears in its own closure.
+    gen_func = FunctionType(
+        f_code,
+        frame.f_globals,  # don't copy the globals
+        gen.__name__,
+        (),
+        (_empty_cell(),) * len(f_code.co_freevars),
+    )
+    try:
+        gen_func.__qualname__ = gen.__qualname__
+    except AttributeError:
+        # there is no __qualname__ on generators in Python < 3.5
+        pass
+
+    skel = _create_skeleton_generator(gen_func)
+    if memo is None:
+        memo = {}
+
+    # Put the skeleton in the memoize dict before copying the locals
+    # if the generator is in a closure, then the instance itself may appear
+    # in its own locals.
+    memo[id(skel)] = skel
+    f_locals = copy.deepcopy(frame.f_locals, memo=memo)
+
+    _fill_generator(skel, frame.f_lasti, f_locals, private_frame_data(frame))
+    return skel

sorry I did't post this earlier, this diff is what I was thinking of. It is like 200x faster than pickle.loads(pickle.dumps(gen)) for simple functions, but probably still much faster for interesting functions. I will post a PR with this and some tests tomorrow.

merc1031 commented 5 years ago

I seem to be getting a SEGV with both the deepcopy code above and the cloudpickle.loads/dumps pair on a generator simulating a List Monad(essentially forking the computation via copy). Not sure why yet, but Ill probably try to boil it down to a small test repo today or tomorrow.

llllllllll commented 5 years ago

Any news on this?

merc1031 commented 5 years ago

Sorry been busy with work, haven't been able to compile down a nice test case, I'll see if I can get to it soon!

On Wed, Jun 12, 2019, 06:40 Joe Jevnik notifications@github.com wrote:

Any news on this?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/llllllllll/cloudpickle-generators/issues/6?email_source=notifications&email_token=AACMAPGHPZQJEGJVP7Q5DV3P2DHBHA5CNFSM4HSBKSZKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODXP7VAY#issuecomment-501217923, or mute the thread https://github.com/notifications/unsubscribe-auth/AACMAPBDA7KKKFAT7LQMQYTP2DHBHANCNFSM4HSBKSZA .

merc1031 commented 5 years ago

I will be travelling into the start of July so its unlikely ill have time to make a good test case before then. Ill try to do so , otherwise itll be after July 1st