zodb / relstorage

A backend for ZODB that stores pickles in a relational database.
Other
54 stars 46 forks source link

POSKeyError after pack postgresql database #505

Open cekk opened 2 months ago

cekk commented 2 months ago

My environment is:

What happened: i tried to pack a large db with blobs after some months of working.

I tried to do it several times without success, because there were some problems that stopped the procedure:

To solve some resource problems started launching packs with different days (40, 30, 20, 10, 0).

Every time that i had a problem and i solved it, i restarted the pack without cleaning any table on the db.

The last time that i restarted it, i've found that the db were corrupted and all the blobs returned a POSKeyError. Also after the pack has succeeded, the db had all the blobs broken.

A backup made before last pack (the one that breaks blobs), the db is ok. the db also went from 140gb to 40.

I've also tried to launch the pack on that backup but i can't reproduce the error (only once stopping the pack several times, but i can't reproduce it).

Any ideas on what could get wrong?

This is an example of traceback with errors:

2024-07-03 09:05:59,807 ERROR [ZODB.Connection:804][MainThread] Couldn't load state for Persistence.mapping.PersistentMapping 0x0e6b85 Traceback (most recent call last): File "/opt/plone/eggs/ZODB-5.8.1-py3.11.egg/ZODB/Connection.py", line 787, in setstate p, serial = self._storage.load(oid) ^^^^^^^^^^^^^^^^^^^^^^^ File "src/perfmetrics/metric.py", line 72, in perfmetrics._metric._AbstractMetricImpl.call File "/opt/plone/eggs/RelStorage-4.0.0-py3.11-linux-x86_64.egg/relstorage/storage/load.py", line 109, in load raise self.__pke(oid, relstorage.interfaces.POSKeyError: 0x0e6b85 ({'cache': <StorageCache at 0x7fcc16a119a0 hvt=286454828400357461 bytes=9842757 len=16271 state=<relstorage.cache.mvcc.MVCCDatabaseCoordinator object at 0x7fcc1693b690>>, 'load': <LoadConnection at 0x7fcc12e81cd0 active=True description={'backend_pid': 2382722} conn=<connection object at 0x7fcc13363ec0; dsn: 'user=xxx password=xxx dbname=xxx host=xxx port=5432 application_name='RS: Load'', closed: 0> cur=<cursor object at 0x7fcc12fe64d0; closed: 0>>, 'adapter': <PostgreSQLAdapter at 7fcc12e81d50 history preserving,dsn="dbname='xxx' host='xxx' port='5432' user='xxx'">, 'reason': 'no tid found', 'current_txn': 286454828400357461, 'recent_tids': '945029'})

2024-07-03 09:05:59,823 ERROR [Zope.SiteErrorLog:36][MainThread] POSKeyError: https://xxx/++api++/@secondary-menu Traceback (innermost last): Module ZPublisher.WSGIPublisher, line 181, in transaction_pubevents Module ZPublisher.WSGIPublisher, line 391, in publish_module Module ZPublisher.WSGIPublisher, line 285, in publish Module ZPublisher.mapply, line 98, in mapply Module ZPublisher.WSGIPublisher, line 68, in call_object Module plone.rest.service, line 21, in call Module plone.restapi.services, line 19, in render Module collective.volto.secondarymenu.restapi.secondary_menu.get, line 28, in reply Module collective.volto.secondarymenu.restapi.serializer.secondary_menu, line 36, in serialize_data Module design.plone.ctgeneric.restapi.serializer.summary, line 20, in call Module design.plone.contenttypes.restapi.serializers.summary, line 152, in call Module redturtle.volto.restapi.serializer.summary, line 72, in call Module plone.restapi.serializer.summary, line 99, in call Module plone.app.contentlisting.realobject, line 89, in review_state Module Products.CMFCore.WorkflowTool, line 283, in getInfoFor Module Products.DCWorkflow.DCWorkflow, line 300, in getInfoFor Module Products.DCWorkflow.DCWorkflow, line 130, in _getWorkflowStateOf Module Products.CMFCore.WorkflowTool, line 341, in getStatusOf Module Products.CMFCore.WorkflowTool, line 625, in get Module ZODB.Connection, line 787, in setstate Module perfmetrics._metric, line 72, in perfmetrics._metric._AbstractMetricImpl.call Module relstorage.storage.load, line 109, in load relstorage.interfaces.POSKeyError: 0x0e6b85 ({'cache': <StorageCache at 0x7fcc16a119a0 hvt=286454828400357461 bytes=9843160 len=16272 state=<relstorage.cache.mvcc.MVCCDatabaseCoordinator object at 0x7fcc1693b690>>, 'load': <LoadConnection at 0x7fcc12e81cd0 active=True description={'backend_pid': 2382722} conn=<connection object at 0x7fcc13363ec0; dsn: 'user=xxx password=xxx dbname=xxx host=xxx port=5432 application_name='RS: Load'', closed: 0> cur=<cursor object at 0x7fcc12fe64d0; closed: 0>>, 'adapter': <PostgreSQLAdapter at 7fcc12e81d50 history preserving,dsn="dbname='xxx' host='xxx' port='5432' user='xxx'">, 'reason': 'no tid found', 'current_txn': 286454828400357461, 'recent_tids': '945029'})

2024-07-03 09:06:18,673 ERROR [ZODB.Connection:804][MainThread] Couldn't load state for BTrees.IIBTree.IITreeSet 0x1789e1 Traceback (most recent call last): File "/opt/plone/eggs/ZODB-5.8.1-py3.11.egg/ZODB/Connection.py", line 787, in setstate p, serial = self._storage.load(oid) ^^^^^^^^^^^^^^^^^^^^^^^ File "src/perfmetrics/metric.py", line 66, in perfmetrics._metric._AbstractMetricImpl.call File "/opt/plone/eggs/RelStorage-4.0.0-py3.11-linux-x86_64.egg/relstorage/storage/load.py", line 109, in load raise self.__pke(oid, relstorage.interfaces.POSKeyError: 0x1789e1 ({'cache': <StorageCache at 0x7fcc16a119a0 hvt=286454828400357461 bytes=9843077 len=16330 state=<relstorage.cache.mvcc.MVCCDatabaseCoordinator object at 0x7fcc1693b690>>, 'load': <LoadConnection at 0x7fcc12e81cd0 active=True description={'backend_pid': 2382722} conn=<connection object at 0x7fcc13363ec0; dsn: 'user=xxx password=xxx dbname=xxx host=xxx port=5432 application_name='RS: Load'', closed: 0> cur=<cursor object at 0x7fcc12fe64d0; closed: 0>>, 'adapter': <PostgreSQLAdapter at 7fcc12e81d50 history preserving,dsn="dbname='xxx' host='xxx' port='5432' user='xxx'">, 'reason': 'no tid found', 'current_txn': 286454828400357461, 'recent_tids': '1542625'})

templiert commented 4 days ago

I think I am having the same issue, and I suspect it is due to packing. My application ran OK for several months with daily packing, using configuration 1, see below. I then had to reset my python install for other reasons, and resumed my application with configuration 2, see below. Since then, I get frequent corruptions with POSKeyErrors. I cannot unambiguously blame packing due to how my application works, but a packing always occurred before getting the errors.

I suspect that it is due to the updates I made from configuration 1 to 2. Below is also a diff of the version changes.

My application uses ZODB directly with Postgresql 16, it is not using the Zope framework.

I have now disabled daily packing. I can report back here in a week whether the errors still occur without packing.

error

E 2024-09-07T20:35:13.997 Couldn't load state for src.station.check.base.checkerslabbase.CheckerSlabBase 0x84ea96 | ZODB.Connection | 8144 | MainProcess | MainThread | 4524 | Connection | setstate | 801 | (<class 'relstorage.interfaces.POSKeyError'>, POSKeyError(b'\x00\x00\x00\x00\x00\x84\xea\x96', {'cache': <StorageCache at 0x1a7fbe42260 hvt=286867541487242410 bytes=1962801385 len=361831 state=<relstorage.cache.mvcc.MVCCDatabaseCoordinator object at 0x000001A769AA8110>>, 'load': <LoadConnection at 0x1a769dea3f0 active=True description={'backend_pid': 15036} conn=<connection object at 0x000001A769D42200; dsn: 'user=zodbuser password=xxx dbname=ibeammsem_2 host=localhost application_name='RS: Load'', closed: 0> cur=<cursor object at 0x000001A7D22DADC0; closed: 0>>, 'adapter': <PostgreSQLAdapter at 1a7b39b8110 history free,dsn='dbname=myapp_2 user=zodbuser host=localhost'>, 'reason': 'no tid found'}), <traceback object at 0x000001A8552FCC00>) | None
Traceback (most recent call last):
  File "C:\python\miniforge-pypy3\envs\myapp\Lib\site-packages\ZODB\Connection.py", line 784, in setstate
    p, serial = self._storage.load(oid)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "src\\perfmetrics\\metric.py", line 66, in perfmetrics._metric._AbstractMetricImpl.__call__
  File "C:\python\miniforge-pypy3\envs\myapp\Lib\site-packages\relstorage\storage\load.py", line 109, in load
    raise self.__pke(oid,
relstorage.interfaces.POSKeyError: 0x84ea96 ({'cache': <StorageCache at 0x1a7fbe42260 hvt=286867541487242410 bytes=1962801385 len=361831 state=<relstorage.cache.mvcc.MVCCDatabaseCoordinator object at 0x000001A769AA8110>>, 'load': <LoadConnection at 0x1a769dea3f0 active=True description={'backend_pid': 15036} conn=<connection object at 0x000001A769D42200; dsn: 'user=zodbuser password=xxx dbname=myapp_2 host=localhost application_name='RS: Load'', closed: 0> cur=<cursor object at 0x000001A7D22DADC0; closed: 0>>, 'adapter': <PostgreSQLAdapter at 1a7b39b8110 history free,dsn='dbname=myapp_2 user=zodbuser host=localhost'>, 'reason': 'no tid found'})
Traceback (most recent call last):
  ...
  File "C:\python\miniforge-pypy3\envs\myapp\Lib\site-packages\ZODB\Connection.py", line 784, in setstate
    p, serial = self._storage.load(oid)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "src\\perfmetrics\\metric.py", line 66, in perfmetrics._metric._AbstractMetricImpl.__call__
  File "C:\python\miniforge-pypy3\envs\myapp\Lib\site-packages\relstorage\storage\load.py", line 109, in load
    raise self.__pke(oid,
relstorage.interfaces.POSKeyError: 0x84ea96 ({'cache': <StorageCache at 0x1a7fbe42260 hvt=286867541487242410 bytes=1962801385 len=361831 state=<relstorage.cache.mvcc.MVCCDatabaseCoordinator object at 0x000001A769AA8110>>, 'load': <LoadConnection at 0x1a769dea3f0 active=True description={'backend_pid': 15036} conn=<connection object at 0x000001A769D42200; dsn: 'user=xxx password=xxx dbname=myapp_2 host=localhost application_name='RS: Load'', closed: 0> cur=<cursor object at 0x000001A7D22DADC0; closed: 0>>, 'adapter': <PostgreSQLAdapter at 1a7b39b8110 history free,dsn='dbname=myapp_2 user=xxx host=localhost'>, 'reason': 'no tid found'})

diff of configurations 1 and 2

< BTrees==5.0
> BTrees==6.0

< cffi @ file:///C:/ci_311/cffi_1676423759166/work
> cffi @ file:///D:/bld/cffi_1724956411237/work

< perfmetrics==4.0.0
> perfmetrics==4.1.0

< persistent==5.0
> persistent==6.0

> pg8000==1.31.2

< transaction==3.1.0
> transaction==4.0

< ZConfig==4.0
> ZConfig==4.1

< zdaemon==5.0
> zdaemon==5.1

< ZEO==5.4.1
> ZEO==6.0.0

< zipp==3.19.2
> zipp==3.20.1

< zodbpickle==3.0.1
> zodbpickle==4.0

< zope.configuration==5.0
> zope.configuration==5.0.1

< zope.container==5.1
> zope.container==6.0

< zope.exceptions==5.0.1
> zope.exceptions==5.1

< zope.hookable==5.4
> zope.hookable==6.0

< zope.i18nmessageid==6.0.1
> zope.i18nmessageid==6.1.0

< zope.interface @ file:///C:/ci_311/zope.interface_1676439868776/work
> zope.interface==7.0.3

< zope.pagetemplate==5.0
> zope.pagetemplate==5.1

< zope.proxy==5.0.0
> zope.proxy==5.3

< zope.security==6.1
> zope.security==7.1

< zstandard==0.19.0
> zstandard==0.23.0

configuration 1

BTrees==5.0
cffi @ file:///C:/ci_311/cffi_1676423759166/work
gevent==24.2.1
greenlet==3.0.3
perfmetrics==4.0.0
persistent==5.0
psycopg2==2.9.9
psycopg2cffi==2.9.0
RelStorage==4.0.0
transaction==3.1.0
zc.lockfile==3.0.post1
ZConfig==4.0
zdaemon==5.0
ZEO==5.4.1
zict==3.0.0
zipp==3.19.2
ZODB==6.0
zodbbrowser==0.17.1
zodbpickle==3.0.1
zope.annotation==5.0
zope.app.applicationcontrol==4.1.0
zope.app.appsetup==5.0
zope.app.authentication==5.0
zope.app.basicskin==5.0
zope.app.broken==5.0
zope.app.component==5.0
zope.app.container==5.0
zope.app.content==5.0
zope.app.dependable==5.0
zope.app.error==4.1.0
zope.app.exception==5.0
zope.app.folder==5.0
zope.app.form==6.0
zope.app.generations==5.0
zope.app.http==5.0
zope.app.i18n==4.1.0
zope.app.locales==5.0
zope.app.localpermission==5.0
zope.app.pagetemplate==5.0
zope.app.principalannotation==5.0
zope.app.publication==5.0
zope.app.publisher==5.0
zope.app.rotterdam==5.0
zope.app.schema==5.0
zope.app.security==6.0
zope.app.server==5.0
zope.app.session==5.0
zope.app.wsgi==5.0
zope.app.zcmlfiles==5.0
zope.applicationcontrol==5.0
zope.authentication==5.0
zope.browser==3.0
zope.browsermenu==5.0
zope.browserpage==5.0
zope.browserresource==5.1
zope.cachedescriptors==5.0
zope.component==6.0
zope.componentvocabulary==2.3.0
zope.configuration==5.0
zope.container==5.1
zope.contenttype==5.1
zope.copy==4.3
zope.copypastemove==5.0
zope.datetime==5.0.0
zope.deferredimport==5.0
zope.deprecation==5.0
zope.dottedname==6.0
zope.dublincore==5.0
zope.error==5.0
zope.event==5.0
zope.exceptions==5.0.1
zope.filerepresentation==6.0
zope.formlib==6.0
zope.generations==5.1.0
zope.hookable==5.4
zope.i18n==5.1
zope.i18nmessageid==6.0.1
zope.interface @ file:///C:/ci_311/zope.interface_1676439868776/work
zope.lifecycleevent==5.0
zope.location==5.0
zope.login==3.0
zope.minmax==2.3
zope.pagetemplate==5.0
zope.password==4.4
zope.pluggableauth==3.0
zope.principalannotation==5.0
zope.principalregistry==5.0
zope.processlifetime==3.0
zope.proxy==5.0.0
zope.ptresource==5.0
zope.publisher==7.0
zope.schema==7.0.1
zope.security==6.1
zope.securitypolicy==5.0
zope.server==4.0.2
zope.session==5.1
zope.site==5.0
zope.size==5.0
zope.tal==5.0.1
zope.tales==6.0
zope.traversing==5.0
zope.vocabularyregistry==1.2.0
zstandard==0.19.0

configuration 2

BTrees==6.0
cffi @ file:///D:/bld/cffi_1724956411237/work
gevent==24.2.1
greenlet==3.0.3
perfmetrics==4.1.0
persistent==6.0
pg8000==1.31.2
psycopg2==2.9.9
psycopg2cffi==2.9.0
RelStorage==4.0.0
transaction==4.0
zc.lockfile==3.0.post1
ZConfig==4.1
zdaemon==5.1
ZEO==6.0.0
zict==3.0.0
zipp==3.20.1
ZODB==6.0
zodbbrowser==0.17.1
zodbpickle==4.0
zope.annotation==5.0
zope.app.applicationcontrol==4.1.0
zope.app.appsetup==5.0
zope.app.authentication==5.0
zope.app.basicskin==5.0
zope.app.broken==5.0
zope.app.component==5.0
zope.app.container==5.0
zope.app.content==5.0
zope.app.dependable==5.0
zope.app.error==4.1.0
zope.app.exception==5.0
zope.app.folder==5.0
zope.app.form==6.0
zope.app.generations==5.0
zope.app.http==5.0
zope.app.i18n==4.1.0
zope.app.locales==5.0
zope.app.localpermission==5.0
zope.app.pagetemplate==5.0
zope.app.principalannotation==5.0
zope.app.publication==5.0
zope.app.publisher==5.0
zope.app.rotterdam==5.0
zope.app.schema==5.0
zope.app.security==6.0
zope.app.server==5.0
zope.app.session==5.0
zope.app.wsgi==5.0
zope.app.zcmlfiles==5.0
zope.applicationcontrol==5.0
zope.authentication==5.0
zope.browser==3.0
zope.browsermenu==5.0
zope.browserpage==5.0
zope.browserresource==5.1
zope.cachedescriptors==5.0
zope.component==6.0
zope.componentvocabulary==2.3.0
zope.configuration==5.0.1
zope.container==6.0
zope.contenttype==5.1
zope.copy==4.3
zope.copypastemove==5.0
zope.datetime==5.0.0
zope.deferredimport==5.0
zope.deprecation==5.0
zope.dottedname==6.0
zope.dublincore==5.0
zope.error==5.0
zope.event==5.0
zope.exceptions==5.1
zope.filerepresentation==6.0
zope.formlib==6.0
zope.generations==5.1.0
zope.hookable==6.0
zope.i18n==5.1
zope.i18nmessageid==6.1.0
zope.interface==7.0.3
zope.lifecycleevent==5.0
zope.location==5.0
zope.login==3.0
zope.minmax==2.3
zope.pagetemplate==5.1
zope.password==4.4
zope.pluggableauth==3.0
zope.principalannotation==5.0
zope.principalregistry==5.0
zope.processlifetime==3.0
zope.proxy==5.3
zope.ptresource==5.0
zope.publisher==7.0
zope.schema==7.0.1
zope.security==7.1
zope.securitypolicy==5.0
zope.server==4.0.2
zope.session==5.1
zope.site==5.0
zope.size==5.0
zope.tal==5.0.1
zope.tales==6.0
zope.traversing==5.0
zope.vocabularyregistry==1.2.0
zstandard==0.23.0