uqfoundation / dill

serialize all of Python
http://dill.rtfd.io

Pickling, especially unpickling, is not thread-safe. #350

Open tvalentyn opened 4 years ago

tvalentyn commented 4 years ago

The CPython pickler appears not to be thread-safe, and frequently causes deadlocks or corrupted module imports on Python 3.x when we concurrently unserialize pickles that reference the same modules in the pickled payload [1].

A common manifestation of this error for Dill users is:

  File "python3.5/site-packages/apache_beam/internal/pickler.py", line 265, in loads
    return dill.loads(s)                                                           
  File "python3.5/site-packages/dill/_dill.py", line 317, in loads                 
    return load(file, ignore)                                                      
  File "python3.5/site-packages/dill/_dill.py", line 305, in load                  
    obj = pik.load()                                                               
  File "python3.5/site-packages/dill/_dill.py", line 474, in find_class            
    return StockUnpickler.find_class(self, module, name)                           
AttributeError: Can't get attribute 'ClassName' on <module 'ModuleName' from 'python3.5/site-packages/filename.py'>

Another common manifestation is:

    from some_package import some_module  
  File "<frozen importlib._bootstrap>", line 968, in _find_and_load
  File "<frozen importlib._bootstrap>", line 149, in __enter__
  File "<frozen importlib._bootstrap>", line 94, in acquire
_frozen_importlib._DeadlockError: deadlock detected by _ModuleLock('some_package.some_module') at 139642620901192

I am not yet sure whether this is working as intended (WAI) from CPython's perspective, but I wanted to raise this issue for visibility in Dill land. Our mitigation plan for now is to guard pickle operations with a lock [2].

[1] https://issues.apache.org/jira/browse/BEAM-8651?focusedCommentId=16977923&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16977923
[2] https://github.com/apache/beam/pull/10167
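
For readers who want to try the lock-based mitigation described above, here is a minimal sketch. It is not the code from the Beam pull request; the names `locked_dumps`, `locked_loads`, `demo_concurrent_loads`, and `_pickle_lock` are hypothetical. It assumes every pickling call in the process goes through these wrappers, and it falls back to the stdlib `pickle` module when dill is not installed, since `dumps` and `loads` are called the same way in both.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

try:
    import dill as serializer  # the library this issue is about
except ImportError:
    import pickle as serializer  # stdlib fallback so the sketch still runs

# Hypothetical process-wide lock. An RLock is used so that a hook which
# re-enters these wrappers on the same thread does not block on itself.
_pickle_lock = threading.RLock()


def locked_dumps(obj):
    """Serialize obj while holding the global lock."""
    with _pickle_lock:
        return serializer.dumps(obj)


def locked_loads(data):
    """Deserialize data while holding the global lock."""
    with _pickle_lock:
        return serializer.loads(data)


def demo_concurrent_loads(n_workers=8, n_tasks=50):
    """Unpickle the same payload from several threads through the lock."""
    payload = locked_dumps({"answer": 42})
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(locked_loads, [payload] * n_tasks))
```

The lock only serializes calls that go through the wrappers. Imports triggered elsewhere in other threads still run outside it, so this narrows the window for the errors shown above rather than ruling them out.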

tvalentyn commented 4 years ago

See also: https://bugs.python.org/issue38884 https://bugs.python.org/issue34572

redstoneleo commented 2 years ago

Is there any way to recover the corrupted data? I ran into the same problem: https://groups.google.com/g/comp.lang.python/c/FFfiN_TayfE/m/YUgdlG0PDwAJ

mmckerns commented 2 years ago

@redstoneleo: I'd agree with the commenters in the linked discussion... you can sometimes, with a lot of hard work, recover a corrupted pickle by hand, and the best way to guard against reading and writing to the same file at once is to (as @tvalentyn suggests) guard the operation with a lock.

@tvalentyn: this is still an open issue for CPython pickling, correct? You can see from the traceback that dill falls down on this issue when it is relying on the CPython pickler.
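
As an illustration of guarding file reads and writes with a lock, here is a rough sketch. The names `dump_to_file`, `load_from_file`, and `_file_lock` are hypothetical, and the lock only coordinates threads inside one process, not separate processes. Writing to a temporary file and swapping it in with `os.replace` is an extra precaution added here, not something suggested in this thread. The stdlib `pickle` is used so the sketch runs anywhere; `dill.dump` and `dill.load` take the same arguments.

```python
import os
import pickle
import tempfile
import threading

# Hypothetical module-level lock shared by every reader and writer thread.
_file_lock = threading.Lock()


def dump_to_file(obj, path):
    """Pickle obj to path while holding the lock."""
    directory = os.path.dirname(os.path.abspath(path))
    with _file_lock:
        # Write to a temporary file in the same directory first, so a
        # failed dump never leaves a half-written pickle at `path`.
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        try:
            with os.fdopen(fd, "wb") as tmp:
                pickle.dump(obj, tmp)
            os.replace(tmp_path, path)  # swap the finished file into place
        except BaseException:
            os.unlink(tmp_path)  # clean up the temporary file on failure
            raise


def load_from_file(path):
    """Unpickle the object stored at path while holding the lock."""
    with _file_lock:
        with open(path, "rb") as f:
            return pickle.load(f)
```

A file that is already corrupted is not repaired by this; it only aims to keep one thread from reading while another is partway through writing.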

tvalentyn commented 2 years ago

As far as I know, it's not fixed in CPython.

Yarin-Shitrit commented 4 months ago

I guess this hasn't been solved yet?