Another error in the tests that sometimes happens:
```
===================================================== FAILURES ======================================================
_____________________________________________ TestSync.test_watch_move ______________________________________________

self = <perfact.zodbsync.tests.test_sync.TestSync object at 0x7f66180bcbe0>
conn = <perfact.zodbsync.helpers.Namespace object at 0x7f66180a3dc0>

    def test_watch_move(self, conn):
        """
        Create a Page Template, record it using the watcher, rename it and make
        sure the watcher notices. Then add a second one and do a
        three-way-rename in one transaction, making sure the watcher keeps
        track.
        """
        watcher = self.runner('watch')
        watcher.setup()
        root = self.repo.path + '/__root__/'
        src = '/__source-utf8__.html'
        app = conn.app
        add = app.manage_addProduct['PageTemplates'].manage_addPageTemplate
        rename = app.manage_renameObject
        with conn.tm:
            add(id='test1', text='test1')
        watcher.step()
        assert os.path.isdir(root + 'test1')
        # Not sure how to apply this specifically to the secondary connection
        # and why it is only needed for the rename and not the adding, but it
        # seems to do the job
        userfolder = conn.app.acl_users
        user = userfolder.getUser('perfact').__of__(userfolder)
        newSecurityManager(None, user)
        with conn.tm:
            rename('test1', 'test2')
        watcher.step()
        assert os.path.isdir(root + 'test2')
        assert not os.path.isdir(root + 'test1')
        with conn.tm:
            add(id='test1', text='test2')
        watcher.step()
>       assert os.path.isdir(root + 'test1')
E       AssertionError: assert False
E        +  where False = <function isdir at 0x7f661bf42160>(('/tmp/tmp6tuvwh1b/__root__/' + 'test1'))
E        +    where <function isdir at 0x7f661bf42160> = <module 'posixpath' from '/usr/lib/python3.9/posixpath.py'>.isdir
E        +      where <module 'posixpath' from '/usr/lib/python3.9/posixpath.py'> = os.path

perfact/zodbsync/tests/test_sync.py:395: AssertionError
```
Some new information: some of the failures are simply a timing issue (after a commit on one connection, I cannot expect the other connection to see the result immediately). This I could fix by retrying with a delay at the corresponding positions (PR will be created soon). That reduced the number of failures; the remaining cases are those that actually run into the failure mode - even with all following steps, the change is never noticed.
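A minimal sketch of such a retry (the `wait_for` helper is hypothetical, not necessarily what the PR does):

```python
import time

def wait_for(predicate, attempts=10, delay=0.1):
    """Retry a check with a small delay, for assertions that race
    against a commit made on another ZEO connection."""
    for _ in range(attempts):
        if predicate():
            return True
        time.sleep(delay)
    return False

# In the test, instead of a bare assert:
# assert wait_for(lambda: os.path.isdir(root + 'test1'))
```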
I then added a debugging trace to `Watch._read_changed_oids` to check which transaction IDs are actually checked, and a breakpoint at the point where the actual failure occurs. The function was called with the following arguments (base64-encoded):

```
[(b'A99w2L4NU80=', b'A99w2L6Xosw='), (b'A99w2L6Xos0=', b'A99w2L6kpGY=')]
```
The `undoInfo()` output contains the following entries:

```
[{'description': '',
  'id': b'A99w2L6kpGY=',
  'size': 1348,
  'time': 1618667084.6819844,
  'user_name': ''},
 {'description': '',
  'id': b'A99w2L6Xosw=',
  'size': 1338,
  'time': 1618667084.6700768,
  'user_name': ''},
 {'description': '',
  'id': b'A99w2L4NU8w=',
  'size': 397,
  'time': 1618667084.5434513,
  'user_name': ''},
 {'description': 'Added default view for root object',
  'id': b'A99w2L4J/7s=',
  'size': 1676,
  'time': 1618667084.5404043,
  'user_name': ''},
 {'description': 'Added virtual_hosting',
  'id': b'A99w2L3pKt0=',
  'size': 829,
  'time': 1618667084.510346,
  'user_name': ''},
 {'description': '',
  'id': b'A99w2L3ez4g=',
  'size': 717,
  'time': 1618667084.500864,
  'user_name': ''},
 {'description': 'initial database creation',
  'id': b'A99w2LzemyI=',
  'size': 154,
  'time': 1618667084.2663016,
  'user_name': ''}]
```
This is at `test_watch_move`, after renaming from `test1` to `test2`. `watcher.object_tree` still contains `test1`, while `watcher.app.objectIds()` contains `test2`.
This means: there was a call to `_read_changed_oids` that checked the transaction `A99w2L6kpGY=` and should have seen the rename, but it did not. The bug seems to be in `_set_last_visible_txn`: it yields a transaction ID that is not actually visible yet.
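As a sanity check (standalone snippet; I am assuming the pairs are inclusive `[start, end]` TID ranges, which the contiguous values suggest), the rename transaction really does fall inside the checked ranges, and its TID decodes to the same timestamp that `undoInfo()` reports:

```python
import base64
from ZODB.TimeStamp import TimeStamp

# The arguments from the debug trace and the newest undoInfo() entry
ranges = [(b'A99w2L4NU80=', b'A99w2L6Xosw='), (b'A99w2L6Xos0=', b'A99w2L6kpGY=')]
rename_tid = base64.b64decode(b'A99w2L6kpGY=')

# ZODB TIDs are 8-byte big-endian timestamps, so raw bytes sort chronologically
for start, end in ranges:
    if base64.b64decode(start) <= rename_tid <= base64.b64decode(end):
        print('rename txn covered by', (start, end))

print(TimeStamp(rename_tid).timeTime())  # ~1618667084.68, as in undoInfo()
```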
Another clue: this time, it failed after creating `/test1` and not finding it on disk:

```
Found 2 changed objects
Found 2 changed objects
Found 2 changed objects
OIDs: [b'\x00\x00\x00\x00\x00\x00\x00\x01', b'\x00\x00\x00\x00\x00\x00\x00e']
OIDs: [b'\x00\x00\x00\x00\x00\x00\x00\x01', b'\x00\x00\x00\x00\x00\x00\x00e']
OIDs: [b'\x00\x00\x00\x00\x00\x00\x00\x01', b'\x00\x00\x00\x00\x00\x00\x00e']
Recording /
Recording /
Recording /
```
The OID ending in `\x00e` is indeed that of `/test1`. Somehow the watcher saw the object but only recorded `/`!
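For reference, ZODB OIDs are just 8-byte big-endian integers; a quick way to check the mapping (sketch; the `_p_oid` comparison assumes the test's `conn` fixture):

```python
from ZODB.utils import u64

print(u64(b'\x00\x00\x00\x00\x00\x00\x00\x01'))  # 1, presumably the app object ('/')
print(u64(b'\x00\x00\x00\x00\x00\x00\x00e'))     # 101

# Compare against the live object:
# assert conn.app.test1._p_oid == b'\x00\x00\x00\x00\x00\x00\x00e'
```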
All right, I think I found the origin of the problem:
```python
import transaction
import ZEO

class Conn:
    def __init__(self):
        self.tm = transaction.TransactionManager()
        self.db = ZEO.DB('zeo.sock')
        self.conn = self.db.open(self.tm)
        self.app = self.conn.root.Application

c1 = Conn()
c2 = Conn()

# Clean up property if it exists
with c1.tm:
    if c1.app.hasProperty('testprop'):
        c1.app.manage_delProperties(ids=['testprop'])

# Open a transaction to try and see a consistent snapshot
c1.tm.begin()
# We see the current transaction
print(c1.db.lastTransaction())
# We see that the property is not set
print(c1.app.getProperty('testprop'))

# Using the second connection, set the property and commit
with c2.tm:
    c2.app.manage_addProperty('testprop', 'test', 'string')

# We do not see the property on the first connection - we are in the middle
# of an older transaction
print(c1.app.getProperty('testprop'))
# BUT lastTransaction() is not the transaction we are looking at, it is the
# newer one created by c2
print(c1.db.lastTransaction())

c1.tm.abort()
with c1.tm:
    print(c1.app.getProperty('testprop'))
```
The code in `watch.py` does not use `lastTransaction()` but `undoInfo()`; the problem is the same, though: it does not give us the transaction we are actually looking at, but the newest transaction that has been committed on the ZEO server, which might be newer.
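The same mismatch shows up with `undoInfo()` directly (sketch, continuing the script above; the entries come back newest first with base64-encoded IDs, as in the dump earlier):

```python
c1.tm.begin()                       # pin a snapshot on the first connection

with c2.tm:                         # commit something newer on the second one
    c2.app.manage_addProperty('testprop2', 'test', 'string')

# The first entry is c2's fresh commit, not the transaction that c1's
# snapshot is based on:
print(c1.db.undoInfo()[0]['id'])
c1.tm.abort()
```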
It seems there is another way to access the transaction number, one that actually shows the transaction we are looking at: `app._p_serial`. This might be the fix.
Unfortunately, this does not work. `app._p_serial` is not changed by new transactions unless they actually touch the `app` object.
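A minimal illustration, reusing the `Conn` helper from the script above and assuming a Page Template `test1` exists next to `app` (as in the failing test):

```python
c1 = Conn()
c2 = Conn()

with c1.tm:
    before = c1.app._p_serial       # TID of the last transaction that wrote app

# Commit a transaction on the second connection that only writes the
# template object, not app itself:
with c2.tm:
    c2.app.test1.pt_edit('changed', 'text/html')

with c1.tm:
    # Unchanged although a newer transaction exists, so _p_serial cannot
    # serve as a "current snapshot" marker:
    assert c1.app._p_serial == before
```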
This is a hard-to-reproduce error, as it occurs only sporadically. I have not been able to pin down the exact circumstances under which it occurs.

As far as I have been able to understand, the wrong behavior is as follows: `zodbsync pick` is used to pick some commits, which modifies some objects as well as adding some. The changed or added objects are played back into the ZODB. The correct new state of the objects is present in Zope, but not on the file system. Further sweeps do not pick up any additional transactions and do not correct this.

This should not happen. The watcher should always see the objects in the state after the transaction currently stored in `self.last_visible_txn`.

This might be related to the watcher tests failing sporadically, which is also hard to pin down. One example traceback is the `test_watch_move` failure shown at the top of this issue.