Closed ml31415 closed 8 years ago
On Monday, January 11, 2016 01:53:26 AM Michael wrote:
File "/usr/local/lib/python2.7/dist-packages/persistent/persistence.py", line 267, in __getattribute__
oga(self, '_p_accessed')()
KeyboardInterrupt
Any ideas what went wrong here? (Linux, x86_64, cpython 2.7.10)
It looks like you are using the Python version of persistence? IOW, why are you inside the persistence.py module?
Regards,
Stephan Richter Entrepreneur & Geek
Well, I actually didn't intend to use a pure Python version. After having a look at the commit history, it looks like this pure Python stuff is something rather new, and it probably got switched on automatically after the update. In that case, I suppose it's not an issue with the FileStorage, and my question is rather how to get rid of this pure Python stuff. Thanks for pointing me in the right direction!
I suggest looking at the build process. If you build on a machine without a C compiler, then the installation will fall back to the Python version.
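One way to check what a build actually produced is to test whether the compiled extension module can be imported at all. A minimal sketch: `extension_available` is a hypothetical helper, and the module name `persistent.cPersistence` is an assumption about where the C extension lives, so adjust it for your install.

```python
import importlib


def extension_available(dotted_name):
    """Return True if the named module can be imported, False otherwise."""
    try:
        importlib.import_module(dotted_name)
    except ImportError:
        return False
    return True


# Assumed name of the C extension shipped with persistent.
print(extension_available("persistent.cPersistence"))
```

If this prints False on a CPython install, the package was most likely installed without its C extension and the rebuild suggested above is worth trying.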
I really hate magical fallbacks like this. :(
I totally agree. Even worse, one of the BTrees inside the db now got converted to one of these Python BTrees, and this OOBTreePy doesn't feature a _findbucket method, so I get other crashes now ... Such a hassle on a minor version upgrade oO
Note that this has nothing to do with what's in the database. This only affects the application code. "Python" and "C" objects are stored in the database identically.
Can you share a traceback showing the error on _findbucket?
Sure
Traceback (most recent call last):
...
File "...utils/database/database.py", line 424, in lookup
return self.root[cls_search_str][index][key]
File ".../utils/database/database.py", line 48, in __getitem__
return self._tree.__getitem__(key)
File "/usr/local/lib/python2.7/dist-packages/BTrees/_base.py", line 1167, in __getitem__
bucket = self._findbucket(key)
File "/usr/local/lib/python2.7/dist-packages/BTrees/_base.py", line 803, in _findbucket
return child._findbucket(key)
AttributeError: 'BTrees.OOBTree.OOBTree' object has no attribute '_findbucket'
Referring to "This only affects the application code. 'Python' and 'C' objects are stored in the database identically.": If I remember correctly, the full class path is persisted. So even if the internal structure of the data may be the same, as soon as the object gets unpickled, it picks the class that it was before.
So I don't know exactly how I messed that up, but what was a C tree before in my DB now keeps coming back as a Python tree, and then the problems start.
The C and Python versions have the same class location, so, when unpickling, the C version is used if it's available and otherwise the Python version is used.
From the traceback you shared, it looks like you have a mix of C and Python classes. I would consider doing a software rebuild.
The C and Python versions have the same class location, so, when unpickling, the C version is used if it's available and otherwise the Python version is used.
Hmm, I'm not sure that's actually true, at least in half the possible scenarios. It looks like the 'Py' suffix is encoded in the pickle of the Python implementation. Here we are in an environment with PURE_PYTHON=1:
>>> import pickle, BTrees
>>> bt = BTrees.OOBTree.OOBTree()
>>> bt
<BTrees.OOBTree.OOBTree object at 0x1062f1a50>
>>> type(bt) # It's really OOBTreePy
<class 'BTrees.OOBTree.OOBTreePy'>
>>> pickle.dumps(bt)
'ccopy_reg\n__newobj__\np0\n(cBTrees.OOBTree\nOOBTreePy\np1\ntp2\nRp3\n.'
And if I load that pickle in an environment that has the C implementation, I still get the Py version back:
>>> import pickle,BTrees
>>> type(BTrees.OOBTree.OOBTree()) # Prove we have the C version
<type 'BTrees.OOBTree.OOBTree'>
>>> pickle.loads('ccopy_reg\n__newobj__\np0\n(cBTrees.OOBTree\nOOBTreePy\np1\ntp2\nRp3\n.')
<BTrees.OOBTree.OOBTree object at 0x11055d0d8> # "looks" good, but...
>>> bt = pickle.loads('ccopy_reg\n__newobj__\np0\n(cBTrees.OOBTree\nOOBTreePy\np1\ntp2\nRp3\n.')
>>> type(bt) # Unpickles to Py
<class 'BTrees.OOBTree.OOBTreePy'>
I think what you say holds true going the other way, from C to Python, because the C pickle only references the "naked" class. Here's the C version:
>>> pickle.dumps(BTrees.OOBTree.OOBTree())
'ccopy_reg\n__newobj__\np0\n(cBTrees.OOBTree\nOOBTree\np1\ntp2\nRp3\n.'
And loaded in the PURE_PYTHON environment:
>>> pickle.loads('ccopy_reg\n__newobj__\np0\n(cBTrees.OOBTree\nOOBTree\np1\ntp2\nRp3\n.')
<BTrees.OOBTree.OOBTree object at 0x1061866d0>
>>> bt = pickle.loads('ccopy_reg\n__newobj__\np0\n(cBTrees.OOBTree\nOOBTree\np1\ntp2\nRp3\n.')
>>> type(bt)
<class 'BTrees.OOBTree.OOBTreePy'>
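The asymmetry in these sessions can be reproduced without BTrees. The following is only a sketch under stated assumptions: `fake_trees`, `CTree` and `TreePy` are hypothetical stand-ins, and rebinding `mod.Tree` simulates switching between an environment with the C class and a PURE_PYTHON one where the plain name is an alias for the Py class. It is written for Python 3.

```python
import pickle
import sys
import types

# Hypothetical stand-in module; this is not the real BTrees package.
mod = types.ModuleType("fake_trees")
sys.modules["fake_trees"] = mod

# The "C" class has the plain name, the "Python" class carries the Py suffix.
CTree = type("Tree", (object,), {"__module__": "fake_trees"})
TreePy = type("TreePy", (object,), {"__module__": "fake_trees"})
mod.TreePy = TreePy

# C environment: the plain name is bound to the C class.
mod.Tree = CTree
c_pickle = pickle.dumps(mod.Tree())

# Pure-Python environment: the plain name is only an alias for TreePy.
mod.Tree = TreePy
py_pickle = pickle.dumps(mod.Tree())
loaded_c_in_py = pickle.loads(c_pickle)

# Back in the C environment, the Py pickle still names TreePy explicitly.
mod.Tree = CTree
loaded_py_in_c = pickle.loads(py_pickle)

print(type(loaded_c_in_py).__name__)
print(type(loaded_py_in_c).__name__)
```

A pickle records the class by module and name, so the pickle written through the alias names `TreePy` and keeps resolving to it, while the pickle that names plain `Tree` resolves to whatever that name is bound to at load time.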
Because the AttributeErrors take different forms between the two implementations, it's possible to say that the above traceback has a "root" that's the Python implementation, but a "child" that's the C implementation. A Python implementation error looks like AttributeError: 'OOBTreePy' object has no attribute 'thing', while a C implementation error looks like AttributeError: 'BTrees.OOBTree.OOBTree' object has no attribute 'thing'.
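That heuristic can be written down as a small helper. This is only a sketch based on the two message forms quoted in this comment; `implementation_from_message` is a hypothetical name, and the message format is what Python 2.7 produced here, so other versions may word it differently.

```python
import re

_PATTERN = re.compile(r"'([\w.]+)' object has no attribute")


def implementation_from_message(message):
    """Guess 'python', 'c', or None from an AttributeError message."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    # Per the two forms quoted in the thread: a dotted type name
    # suggests the C class, a bare name suggests the Python class.
    return "c" if "." in match.group(1) else "python"


print(implementation_from_message(
    "'BTrees.OOBTree.OOBTree' object has no attribute '_findbucket'"))
```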
And loaded in the PURE_PYTHON environment:
But it doesn't produce a valid OOBTreePy. This can be seen by pickling the object again or using it in any way:
>>> bt = pickle.loads('ccopy_reg\n__newobj__\np0\n(cBTrees.OOBTree\nOOBTree\np1\ntp2\nRp3\n.')
>>> type(bt)
<class 'BTrees.OOBTree.OOBTreePy'>
>>> bt.get('foo')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "//lib/python2.7/site-packages/BTrees/_base.py", line 1187, in get
bucket = self._findbucket(key)
File "//site-packages/BTrees/_base.py", line 809, in _findbucket
index = self._search(key)
File "//site-packages/BTrees/_base.py", line 791, in _search
data = self._data
AttributeError: _data
>>> pickle.dumps(bt)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "//lib/python2.7/pickle.py", line 1380, in dumps
Pickler(file, protocol).dump(obj)
File "//lib/python2.7/pickle.py", line 224, in dump
self.save(obj)
File "//lib/python2.7/pickle.py", line 306, in save
rv = reduce(self.proto)
File "//site-packages/BTrees/_base.py", line 989, in __getstate__
data = self._data
AttributeError: _data
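A plausible reading of the missing `_data`: the pickle above carries no state, so loading it only creates the object via `__new__`; `__init__` never runs and nothing is restored afterwards. The sketch below illustrates that mechanism with a hypothetical stand-in class (it is not the BTrees code, and the assumption is that the Python class sets `_data` in `__init__`). It replays the reduce tuple by hand, which is the same recipe pickle follows.

```python
class TreePy(object):
    """Hypothetical stand-in for a class that sets up state in __init__."""

    def __init__(self):
        self._data = []

    def __getstate__(self):
        # An empty tree reports no state, like the state-free pickle above.
        return None


# Ask the object how it wants to be rebuilt (what pickle does internally).
func, args, state = TreePy().__reduce_ex__(2)[:3]

# Rebuild it the way unpickling would: via __new__, without calling __init__.
restored = func(*args)

print(state)                       # the state that would be restored
print(hasattr(restored, "_data"))  # whether __init__'s attribute exists
```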
I suspect this discussion should move to the BTrees tracker. I'll open an issue there.
Is the automatic fallback to the pure Python version actually intended? IMHO it is such a huge performance regression (1 s vs. 6 min just for opening the db) that no sane CPython user would ever want it; it only makes sense for PyPy. If it is not intended, for which subproject should I file the bug report?
I believe the automatic fallback is intended. At least, that's the pattern used by all of the zopefoundation projects I've looked at that have extension classes, including persistent, BTrees, Acquisition, ExtensionClass, zope.proxy, ...
The automatic fallback is indeed intended. It occurs to me now that it's a bug magnet.
In any case, the use or non-use of Python should be transparent to the database, IMO. If it isn't, then the explicit selection of Python or C is especially important.
I'd rather have my program fail instantly (due to missing dependencies or whatever) when connecting to a db than have it automatically pick another implementation and risk corruption of my database. And I guess that should be true for the vast majority of people.
And even if this pickling issue should hopefully be fixed soon, I'd still leave this choice to the user. I just don't see any realistic scenario where you'd be happy with an implementation that you didn't intend to get.
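Until something like that exists, an application can approximate the fail-fast behaviour itself. A minimal sketch, assuming the pure-Python classes keep the `Py` suffix seen in the sessions earlier in this thread; `require_c_implementation` is a hypothetical helper, not part of any zopefoundation package.

```python
def require_c_implementation(cls):
    """Raise instead of silently running on a pure-Python fallback class."""
    if cls.__name__.endswith("Py"):
        raise RuntimeError(
            "%s.%s is the pure-Python implementation; refusing to continue"
            % (cls.__module__, cls.__name__)
        )
    return cls


# Hypothetical usage before opening the database:
#     require_c_implementation(type(BTrees.OOBTree.OOBTree()))
```

Calling it once at startup turns a silent six-minute slowdown into an immediate, explicit error.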
On Tue, Jan 12, 2016 at 3:03 AM, Michael notifications@github.com wrote:
I'd rather have my program fail instantly (due to missing dependencies or whatever) when connecting to a db than have it automatically pick another implementation and risk corruption of my database. And I guess that should be true for the vast majority of people.
This isn't a run-time issue, even though you're seeing the symptoms there. We should come up with a way to express in your package dependencies whether you want a Python or C implementation.
Jim
Jim Fulton http://jimfulton.info
I have a rather large database, about 50 GB of data. Before I recently updated my ZODB installation, opening it was nearly instant, taking about a second. Since the upgrade from 4.0.9 to 4.1.1, it takes about 6 minutes to open. Interrupting this process shows the following stack trace:
Any ideas what went wrong here? (Linux, x86_64, cpython 2.7.10)