zodb / relstorage

A backend for ZODB that stores pickles in a relational database.

Relstorage cache problem? #492

Closed tflorac closed 1 year ago

tflorac commented 2 years ago

Hi,

I'm using RelStorage 3.4.5 in a Pyramid application running on Python 3.5, with PostgreSQL (through psycopg2). Sometimes I get this error:

  ...
  File "/var/local/eggs/pyams_utils-0.1.44-py3.5.egg/pyams_utils/adapter.py", line 158, in get_annotation_adapter
    adapter = annotations.get(key)  # pylint: disable=assignment-from-no-return
  File "/var/local/eggs/zope.annotation-4.7.0-py3.5.egg/zope/annotation/attribute.py", line 68, in get
    return annotations.get(key, default)
  File "/var/local/eggs/ZODB-5.6.0-py3.5.egg/ZODB/Connection.py", line 795, in setstate
    self._reader.setGhostState(obj, p)
  File "/var/local/eggs/ZODB-5.6.0-py3.5.egg/ZODB/serialize.py", line 634, in setGhostState
    obj.__setstate__(state)
SystemError: new style getargs format but argument is not a tuple

After application restart, the problem disappears...

Could this be because of a RelStorage cache problem?

Best regards,

Thierry

jamadden commented 1 year ago

> Could this be because of a RelStorage cache problem?

Unlikely. This is a low-level SystemError that can only be produced by C code doing something incorrect. It arises when a call to something like PyArg_Parse is given a non-tuple; C functions receive an args object that they break apart using a call like that.

Given the traceback, it's likely that obj is a component of an OOBTree, and you can definitely get an error like this if you pass the C code of OOBTree something it doesn't like (which is really a bug in OOBTree); the Python implementation does better:

>>> from BTrees import OOBTree
>>> OOBTree.Bucket().__setstate__(1)
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in <module>:1                                                                                    │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
SystemError: new style getargs format but argument is not a tuple
>>> OOBTree.BucketPy().__setstate__(1)
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in <module>:1                                                                                    │
│                                                                                                  │
│ //lib/python3.10/site-packages/BTrees/_base.py:467 in   │
│ __setstate__                                                                                     │
│                                                                                                  │
│    464 │   │   return (data, )                                                                   │
│    465 │                                                                                         │
│    466 │   def __setstate__(self, state):                                                        │
│ ❱  467 │   │   if not isinstance(state[0], tuple):                                               │
│    468 │   │   │   raise TypeError("tuple required for first state element")                     │
│    469 │   │                                                                                     │
│    470 │   │   self.clear()                                                                      │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: 'int' object is not subscriptable

So most likely there's a mismatch between the type of state and the class of obj. The database and the cache are both keyed primarily by OID (_p_oid), which is why I say a caching problem is unlikely. Instead, I would be suspicious of something playing low-level tricks with OIDs or manipulating the database directly: for example, assigning an OID directly, which leads to a class mismatch with what's in the local Connection's ghost cache, or restoring a backup while the application is running. A restart (or clearing the ghost cache) would force a new ghost object to be created with the right class as read from the database.
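
To make the suspected mechanism concrete, here is a toy model in plain Python. It is not ZODB's actual code: the `database` dict, the `ghost_cache` dict, and the `Bucket` and `Plain` classes are all hypothetical stand-ins. It only illustrates how a ghost created for one class can later be handed a state meant for another class when the record behind its OID changes.

```python
class Bucket:
    """Stand-in for a BTrees bucket: its state must be a tuple."""

    def __setstate__(self, state):
        if not isinstance(state, tuple):
            raise TypeError("tuple required, got %s" % type(state).__name__)
        self.items = state


class Plain:
    """Stand-in for an ordinary persistent object: its state is a dict."""

    def __setstate__(self, state):
        self.__dict__.update(state)


OID = b"\x00" * 7 + b"\x01"

# Hypothetical "database": OID -> (class, state).
database = {OID: (Bucket, (("key", "value"),))}

# Hypothetical per-connection ghost cache: OID -> object without state.
ghost_cache = {}


def get_ghost(oid):
    """Return the cached ghost, creating it with the class stored right now."""
    if oid not in ghost_cache:
        cls, _ = database[oid]
        ghost_cache[oid] = cls.__new__(cls)
    return ghost_cache[oid]


def activate(oid):
    """Load the current state into whatever ghost the cache holds."""
    obj = get_ghost(oid)
    _, state = database[oid]
    obj.__setstate__(state)
    return obj


# 1. The connection sees the OID and caches a ghost of class Bucket.
get_ghost(OID)

# 2. The record is replaced behind the connection's back (for example,
#    a restored backup), so the same OID now holds another class.
database[OID] = (Plain, {"x": 1})

# 3. Activating the stale ghost feeds a dict to Bucket.__setstate__.
try:
    activate(OID)
    failed = False
except TypeError:
    failed = True

# 4. Clearing the ghost cache (or restarting) rebuilds the ghost
#    with the class that is actually stored.
ghost_cache.clear()
fresh = activate(OID)
```

In this sketch the stale ghost fails with a TypeError because the stand-in is written in Python; the C implementation of the real bucket is what turns the same kind of mismatch into the SystemError.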

Closing for now, but if this is something you can still reproduce, feel free to re-open and we can look at some more steps for debugging (in particular, we'd want to get the state value and ideally the entire pickled row from the database for that OID at the same time).
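
As a starting point for that kind of debugging, here is a minimal sketch that disassembles a raw record using only the standard library. It assumes the usual ZODB record layout of two concatenated pickles (class metadata first, then the state). The helper name `disassemble_record` is made up for this example, and the sample record is built by hand rather than read from a real database.

```python
import io
import pickle
import pickletools


def disassemble_record(data):
    """Return (class_part, state_part) as pickletools disassembly text.

    Nothing is unpickled, so no application classes are imported.
    """
    stream = io.BytesIO(data)
    parts = []
    for _ in range(2):
        out = io.StringIO()
        # dis() reads one pickle and leaves the stream just after its STOP.
        pickletools.dis(stream, out=out)
        parts.append(out.getvalue())
    return parts[0], parts[1]


# Hand-built sample record. In a live application the bytes would come
# from the storage for the failing OID (assumption: something like
# storage.load(obj._p_oid)[0]; check this against your ZODB version).
sample = (
    pickle.dumps(("BTrees.OOBTree", "OOBucket"), protocol=3)
    + pickle.dumps((("key", "value"),), protocol=3)
)

class_part, state_part = disassemble_record(sample)
print(class_part)
print(state_part)
```

Comparing the class named in the first part with `type(obj)` at the moment of the failure would show whether the ghost and the stored record disagree.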

tflorac commented 1 year ago

Hi Jason,

I'll try to reproduce the problem. But as far as I can remember, this only occurs in production (where I use Apache and mod_wsgi) and not in development (where I just use Pyramid's "pserve" to run the application).

Best regards,

Thierry