rwth-i6 / sisyphus

A Workflow Manager in Python
Mozilla Public License 2.0
45 stars 25 forks source link

Use `dill` for pickling #192

Closed Icemole closed 5 months ago

Icemole commented 5 months ago

Hi all, I found this dill package that allows serializing some more key elements that pickle can't serialize, such as lambda functions or closures.

I think using this in sisyphus would be very interesting for some test cases in which it's more interesting and comfortable to use a lambda function or a closure, among others. According to the dill repo, it's a drop-in replacement if import dill as pickle is used.

Would you be interested in adding this package to sisyphus?

albertz commented 5 months ago

pickle is stable and future/backwards compatible. dill does not make any such guarantees. So I strongly would prefer pickle.

If you want to serialize some objects that pickle cannot serialize out-of-the-box, it's always possible to extend the pickle serialization. E.g. I have done that for lambda functions and closures. But then you get back to the problem that this is likely very dependent on Python version and not compatible across Python versions.

Also, actually we intentionally do not want to serialize lambda functions or closures but use a well defined object instead, as the hash of such objects is also not really well defined. See sis_hash_helper. It specifically does not allow to hash lambdas or closures.

Instead of using closures, just use functools.partial. That will allow you to pass on any extra arguments, and should still be fully compatible with pickle and sis_hash_helper.

critias commented 5 months ago

I agree 100% with Albert.