uqfoundation / dill

serialize all of Python
http://dill.rtfd.io
Other
2.26k stars 179 forks source link

Function namespace preservation #466

Open anivegesana opened 2 years ago

anivegesana commented 2 years ago

Continuing the discussion from https://github.com/uqfoundation/dill/issues/462#issuecomment-1112761901

Even cloudpickle's behavior is a little bit finicky and has some unintuitive behavior. I think that there are design decisions that need to be made with regards to how we handle global dictionaries and functions.

import cloudpickle

def f():
    global x
    x = 5

def g():
    global x
    x = 0
    f()
    assert x == 5

G, GG = cloudpickle.loads(cloudpickle.dumps((g, g.__globals__)))

G.__globals__ != GG # Seems odd
anivegesana commented 2 years ago

So far, I have just written a proxy object that is supposed to effectively trap whenever the global variables dictionary is pickled (analogous to trapping segfaults to load memory in demand paging) and update the cache with the keys that have already been written out. I still need to handle pickling the GlobalVars object itself, pickling a globals dictionary directly outside of the __globals__ attribute of a function, and updating the _save_function to use this mechanism instead of the copying that it does now.

https://github.com/anivegesana/dill/tree/shared_namespace

mmckerns commented 2 years ago

@anivegesana: the following is the current behavior in dill:

Python 3.8.13 (default, May 10 2022, 11:26:38) 
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> def f():
...   global x
...   x = 5
... 
>>>
>>> def g():
...   global x
...   x = 0
...   f()
...   assert x == 5
... 
>>> G,GG = dill.loads(dill.dumps((g, g.__globals__)))
>>> G.__globals__ == GG
True
>>> g()
>>> G,GG = dill.loads(dill.dumps((g, g.__globals__)))
>>> G.__globals__ == GG
True
>>> 

Which is different than what you are reporting from cloudpickle. It's not clear what you want done in this issue. Can you clarify what this issue is reporting?

anivegesana commented 2 years ago

That piece of code works when recurse is off. The issue is that the same equalities don't hold when recurse is on.

mmckerns commented 2 years ago

Aha... using recuse=True is when dill is the most like cloudpickle... so that makes sense. Ok, got it.