Open jmuchemb opened 6 years ago
As I understand it, ‘noload’ is dead in the python stdlib. However, that is a core requirement of ZODB and hence this library. (If you haven’t tried to GC a large database in an environment without ‘noload’, be prepared to wait. A lot )
Well, I know all that. But there isn't any mention of the removal in the doc or the source code. In the Git history, cPickle.c was removed in 2007, and _pickle.c was added in 2008 (as part of https://bugs.python.org/issue2917), with useless commit messages.
Have we at least asked to get it back in Python 3 ?
Even in Python 2.7, noload
is broken for ZODB. The code here uses its Python 2.6 behaviour (see #9), but that has Python bug 1101399. That bug doesn't matter for the ZODB use case, but it makes noload
less generic, which is a concern for the stdlib.
Given that Python 2.7 has a noload
, but it's broken for ZODB (and the commit that broke it refers to it as an "undocumented, obscure function" and didn't bother to add a test case), and Python 3.x has never had a noload, I would imagine the barrier to get it added to Python 3 would be fairly high. It would have to fix the same bug that 2.7 did but without breaking in the more general case (as well as be documented and tested, presumably).
The window for feature additions to Python 3.7 has closed, so at the earliest such an addition would be part of is 3.8 and we would need this library as long as ZODB supported any versions < 3.8.
I'm not saying it can't happen, but it sounds like a lot of work 😄 I would guess that a starting point would be posting to python-ideas.
Thank you. I didn't know that the status was already so bad for Python 2.7.
In any case, we should also have a clear view of what we change, which is not easy when our code is based on old versions of the stdlib. A first step would be to sanitize our code, by having an upstream branch containing unmodified code of pickle from Python 3, and merge it from time to time to get new features and bugfixes. This way:
Recent discussion: https://github.com/zopefoundation/zodbpickle/pull/64#issuecomment-1025732450
My current plan is to have three versions of the stdlib _python.c
fork in zodbpickle
:
_pickle_33.c
version will stick around only for folks needing extreme pickle-level Python2 compatibility: i.e., they have FileStorages which still contain (or might?) pickles generated under Python2, which must be loaded "correctly" under Python3. I'm not sure how users should signal their need for this version: an environment variable? Making an explicit function call?_pickle_38.c
fork, which adds back the Unpickler.noload
method, using the current Python 3.8 load
as a template, and adds support for the new protocol 4 and 5 opcodes, as well as adopting the change to use module state. This version will be built for all Python3 versions < 3.12._pickle_312.c
fork, applying similar transforms to restore Unpickler.noload
, with additional changes to accommodate the fact that in 3.12, the Unpickler
type is heap-allocated. This version will be built for Python versions >= 3.12.This change will also need to re-fork the pickletools
modules, to account for the new protocol support, and involves adjusting a bunch of tests.
I'm most of the way through the changes to the 3.8 version in C code, but wanted to get some discussion going on the approach before I go much further.
See also https://github.com/zopefoundation/zodbpickle/pull/92 and the comment I made there about probably no longer needing _pickle_33.c
.
@tseaver Thank you for investing time into updating this library to recent changes in the pickle
universe.
zodbpickle has recently entered Debian. And since it forks code from Python, it led to the following discussion: https://lists.debian.org/debian-security-tracker/2018/04/msg00021.html
In particular, I discovered that https://bugs.python.org/issue6784 is fixed and if we manage to get back the
noload
operation in Python 3 (it existed in Python 2), we could stop forking the stdlib modules.errors='bytes'
(used inZODB._compat
), but I guess we can achieve the same result with: