cloudpipe / cloudpickle

Extended pickling support for Python objects

integrate cloudpickle-generators into cloudpickle? #146

Open llllllllll opened 6 years ago

llllllllll commented 6 years ago

The other day I wrote a cloudpickle dispatch to serialize generators, including partially consumed generators. It currently depends on some implementation details of CPython to do it, and it requires a small C extension. As is, the code is Python 3 only; however, it would not require much work to support 2.7.
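For context, here is a small sketch of the limitation this addresses: on CPython, the standard `pickle` module refuses to serialize a generator object, whether or not it has been partially consumed. The `countdown` function is a made-up example and is not taken from cloudpickle-generators.

```python
import pickle


def countdown(n):
    """Toy generator used only for illustration."""
    while n > 0:
        yield n
        n -= 1


gen = countdown(3)
next(gen)  # partially consume the generator

try:
    pickle.dumps(gen)
    picklable = True
except TypeError:
    # CPython raises TypeError because generator objects define no
    # reduction protocol that pickle can use.
    picklable = False

print("generator picklable with stdlib pickle:", picklable)
```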

I am opening this issue to see if people would want to merge this into cloudpickle itself. The downsides are that we would be introducing a C extension, which is hard to maintain, and that it is CPython only. If the other maintainers do not want to incur the cost of adding C to the codebase, I will publish the package on PyPI under the name "cloudpickle-generators" as a third-party extension to cloudpickle, which I would maintain separately from cloudpickle. Regardless of the decision to merge, I will try to get Python 2 support working soon.

https://github.com/llllllllll/cloudpickle-generators

holdenk commented 6 years ago

So I know some of our downstream users run on things like PyPy and Jython, but I assume we could package it in such a way that those folks just wouldn’t have the generator support but otherwise keep working?

llllllllll commented 6 years ago

Yeah, in the setup.py we could check if we are running on CPython and then add the extension. This would be another piece of code to maintain though.
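A minimal sketch of what that check could look like, assuming the interpreter is detected with `platform.python_implementation()`. The extension name and source path below are hypothetical placeholders, not the actual layout of either project.

```python
import platform


def extension_specs(implementation=None):
    """Return (name, sources) pairs for C extensions to build.

    Only CPython gets the extension; other interpreters get an empty list,
    so the package would install as pure Python there.
    """
    implementation = implementation or platform.python_implementation()
    if implementation != "CPython":
        return []
    # Hypothetical module name and source file, for illustration only.
    return [("cloudpickle._generators", ["cloudpickle/_generators.c"])]


# In setup.py these pairs could then be turned into setuptools Extension
# objects, e.g.:
#   ext_modules=[Extension(name, sources) for name, sources in extension_specs()]
print(extension_specs())
```

The interpreter name is passed as an optional argument only so that the non-CPython branch can be exercised without actually running on PyPy or Jython.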

ssanderson commented 6 years ago

I don't have a super strong opinion either way, but another option would be to add an extras_require to the cloudpickle setup.py that adds cloudpickle-generators as a dependency, and then conditionally register cloudpickle generators on import. I think that would give users who care about the feature most of the benefits of having support in-tree, but would allow cloudpickle-generators to be maintained externally if we don't (yet) want to carry the C-extension here.
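The conditional registration could be sketched roughly as follows. This assumes the external package is importable as `cloudpickle_generators` and exposes a `register()` function; both names are assumptions here, and the helper name `register_optional_extension` is made up for the example.

```python
import importlib


def register_optional_extension(module_name="cloudpickle_generators"):
    """Try to enable an optional extension; return True if it was registered.

    If the module is not installed (for example because the extra was not
    requested, or the interpreter is not CPython), fall back silently.
    """
    try:
        ext = importlib.import_module(module_name)
    except ImportError:
        return False
    # Assumption: the extension exposes a zero-argument register() hook.
    register = getattr(ext, "register", None)
    if not callable(register):
        return False
    register()
    return True


# Would be called once at import time of the host package.
generators_enabled = register_optional_extension()
print("generator support enabled:", generators_enabled)
```

With this shape, users who install the extra get generator support automatically, while everyone else keeps the existing behavior.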

ogrisel commented 6 years ago

@llllllllll do you think it would be possible to do the same in pure Python using ctypes? Otherwise I think I would be in favor of the extras_require solution proposed by @ssanderson, along with binary wheels on PyPI for cloudpickle_generators (see: https://github.com/matthew-brett/multibuild).

llllllllll commented 6 years ago

I don't think the project would be any more maintainable or readable with ctypes. If anything, it may hide how implementation-specific it is. Even with ctypes I would be shocked if this worked on PyPy.

I haven't uploaded to PyPI yet, but would binary wheels be all that helpful for a small C89 extension?

ogrisel commented 6 years ago

Binary wheels make it possible to install it trivially for users who don't have a compiler installed on their machine. This is very important for Windows users in particular.

ogrisel commented 6 years ago

> Even with ctypes I would be shocked if this worked on PyPy.

I don't expect it to work on PyPy as it relies on the CPython C-API in both cases. It's just a matter of not introducing an installation dependency on a C compiler.

The indirect users of cloudpickle might not even know what a C compiler is or how to install one on their system, and might not have administrator rights to do so. For instance, they might be students learning Python-based data analytics. Solving compiler installation issues is not part of that.

douglas-raillard-arm commented 2 years ago

Hi @llllllllll, are there still any plans to upload this to PyPI or to attempt merging it into cloudpickle? I am very interested in the ability to copy the state of generators (including their stack frame). It's the bit that is missing to use generator functions as full-blown user-defined monads in Python.