Modernize race tests and add test that catches ZEO corruption on invalidation / disconnect

navytux commented 2 years ago

Add test for Bug2 described in https://github.com/zopefoundation/ZEO/issues/209 + modernize race testing infrastructure. We add this test here, so that it can be run automatically for all storages - e.g. both ZEO and NEO.

ZEO currently can corrupt data if client disconnects simultaneously to another client performing tpc_finish. It fails e.g. as follows:

(z-dev) kirr@deca:~/src/wendelin/z/ZEO5$ ZEO_MTACCEPTOR=1 zope-testrunner -fvvvx --test-path=src -t check_race_external_invalidate_vs_disconnect
/home/kirr/src/wendelin/venv/z-dev/bin/zope-testrunner traceio=True
/home/kirr/src/wendelin/z/ZEO5/src/ZEO/StorageServer.py:51: DeprecationWarning: The mtacceptor module is deprecated and will be removed in ZEO version 6.
  'in ZEO version 6.', DeprecationWarning)
Running tests at level 1
Running .BlobAdaptedFileStorageTests tests:
  Set up .BlobAdaptedFileStorageTests in 0.000 seconds.
  Running:
 check_race_external_invalidate_vs_disconnect (ZEO.tests.testZEO.BlobAdaptedFileStorageTests) (1.889 s)

Failure in test check_race_external_invalidate_vs_disconnect (ZEO.tests.testZEO.BlobAdaptedFileStorageTests)
Traceback (most recent call last):
  File "/usr/lib/python2.7/unittest/case.py", line 329, in run
    testMethod()
  File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/tests/racetest.py", line 357, in check_race_external_invalidate_vs_disconnect
    T2ObjectsInc2Phase())
  File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/tests/util.py", line 400, in _
    return f(*argv, **kw)
  File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/tests/racetest.py", line 446, in _check_race_xxx_vs_external_disconnect
    self.fail('\n\n'.join([_ for _ in failure if _]))
  File "/usr/lib/python2.7/unittest/case.py", line 410, in fail
    raise self.failureException(msg)
AssertionError: T15: obj1 (6) - obj2(4) != phase (1)
obj1._p_serial: 0x03ea4cc505486777  obj2._p_serial: 0x03ea4cc503413799  phase._p_serial: 0x03ea4cc505486777
zconn_at: 0x03ea4cc505486777  # approximated as max(serials)
zstor.loadBefore(obj1, @zconn.at)       ->  serial: 0x03ea4cc505486777  next_serial: None
zstor.loadBefore(obj2, @zconn.at)       ->  serial: 0x03ea4cc504a74099  next_serial: None
zstor.loadBefore(phase, @zconn.at)      ->  serial: 0x03ea4cc505486777  next_serial: None
zstor._cache.clear()
zstor.loadBefore(obj1, @zconn.at)       ->  serial: 0x03ea4cc505486777  next_serial: None
zstor.loadBefore(obj2, @zconn.at)       ->  serial: 0x03ea4cc504a74099  next_serial: 0x03ea4cc506104155
zstor.loadBefore(phase, @zconn.at)      ->  serial: 0x03ea4cc505486777  next_serial: 0x03ea4cc506104155

T51: obj1 (6) - obj2(4) != phase (1)
obj1._p_serial: 0x03ea4cc505486777  obj2._p_serial: 0x03ea4cc503413799  phase._p_serial: 0x03ea4cc505486777
zconn_at: 0x03ea4cc505486777  # approximated as max(serials)
zstor.loadBefore(obj1, @zconn.at)       ->  serial: 0x03ea4cc505486777  next_serial: None
zstor.loadBefore(obj2, @zconn.at)       ->  serial: 0x03ea4cc503413799  next_serial: None
zstor.loadBefore(phase, @zconn.at)      ->  serial: 0x03ea4cc505486777  next_serial: None
zstor._cache.clear()
zstor.loadBefore(obj1, @zconn.at)       ->  serial: 0x03ea4cc505486777  next_serial: None
zstor.loadBefore(obj2, @zconn.at)       ->  serial: 0x03ea4cc504a74099  next_serial: None
zstor.loadBefore(phase, @zconn.at)      ->  serial: 0x03ea4cc505486777  next_serial: None

Please see individual patches for details.

navytux commented 2 years ago

@d-maurer, with https://github.com/zopefoundation/ZODB/pull/368/commits/6f5703c83769c4103c13272c9de87f8ba7362706 I've tried to address the rest of your feedback and incorporate some of your suggestions.

Would you please have another look?

Thanks beforehand, Kirill

d-maurer commented 2 years ago

Kirill Smelkov wrote at 2022-8-31 08:13 -0700:

... Do you think that it will be better e.g. as
   # verify accesses objects in the database and verifies spec invariant.
   ...

In my feeling, it is not very good english. What about 'verify object access and spec invariant orverify spec invariant under concurrent modifications`?

d-maurer commented 2 years ago

Kirill Smelkov wrote at 2022-8-31 08:17 -0700:

...

modify changes objects in the database by executing "next" step.

#

Spec invariant should be preserved.

modify and change are synonyms.

What about modify objects in the database by executing "next"; verify the spec invariant.

d-maurer commented 2 years ago

Kirill Smelkov wrote at 2022-8-31 08:53 -0700:

Like I've tried to explain in https://github.com/zopefoundation/ZODB/pull/368#discussion_r959707435 my idea is to keep models minimal. Models just define ways for database state to be evolved and to assert that intended invariant is held. Everything else - to verify model behaviour on particular usage scenario - is the job of the model checker.

In my opinion, moving the verification logic into the model class would give a more modular design. But you are the author... I do not want to impose my opinion on you.

navytux commented 2 years ago

@d-maurer, thanks for feedback. Regarding verify and modify - those comments are build up as

# <function-name> does something.
def <function-name>:
    ...

Is that maybe what confused you in e.g.

# modify changes objects in the database ...
def modify():
    ...

?

In those places we are only defining functions, not actually executing them - see e.g. here:

https://github.com/zopefoundation/ZODB/blob/6f5703c83769c4103c13272c9de87f8ba7362706/src/ZODB/tests/racetest.py#L180-L192

That's why e.g.

modify objects in the database by executing "next";
verify the spec invariant`.

before def modify might be misleading (I would read it as if we actually modify objects right below).

Said that I understand that there might be various approaches to comment styles and I agree to follow established practice in ZODB if there is such. But before switching I prefer to make sure there is really no confusion here.

Also thanks for being flexible regarding what comes to a model and what lives in a checker. I appreciate that.

navytux commented 2 years ago

P.S. in https://github.com/zopefoundation/ZODB/runs/8117890088?check_suite_focus=true tests failed with

ERROR: InvocationError for command /home/runner/work/ZODB/ZODB/.tox/py38-pure/bin/zope-testrunner --test-path=src --all -vc (exited with code -11 (SIGSEGV)) (exited with code -11)

which is suspicious given that it was pure-python run. But besides a bug in the software somewhere, it might be also due to e.g. bad RAM on the runner machine.

d-maurer commented 2 years ago

Kirill Smelkov wrote at 2022-8-31 10:06 -0700:

... Is that maybe what confused you in e.g.

modify changes objects in the database ...

def modify():

Indeed, I did not recognize that the comment should describe what the function should do.

Maybe, you could use quotes to make this clearer (e.g. "modify" changes objects in the database) or use a modify docstring (e.g.

def modify():
  """changes objects in the database."""

) or omit it completely (as the function name is speaking).

navytux commented 2 years ago

@d-maurer, thanks for feedback. I used not to include function name into its docstring (as "function name is speaking"), but later in my carrier switched to the style where function name is included into comments. Let's have the name quoted following your first suggestion. I've tried to do it with " but found the result more appealing (to me, so subjectively) with single backqoute.

Would you please see whether https://github.com/zopefoundation/ZODB/pull/368/commits/12c1628e7ca2c5e30b28fd936e1618ad2eb15ae6 make things more clear?

navytux commented 2 years ago

@d-maurer, thanks for the approval. I've merged this PR.

I had to mark all our conversations as resolved, because without that github was blocking the merge insisting that "all conversations must be resolved before merging". I believe this is something new, and maybe not exactly what we want. There should be a corresponding setting in https://github.com/zopefoundation/ZODB/settings/branches , but I do not have access to ZODB.git settings page to check. The setting should be named as "Require conversation resolution before merging".

In my subjective view, requiring conversations to be resolved is ok, but the side effect of them becoming collapsed by default is not good. So maybe there is something to reconsider here.

Thanks again for your review and approval.

Kirill

icemac commented 2 years ago

@navytux I (as administrator) activate Require conversation resolution before merging in all zopefoundation repositories I come across. The idea behind is that the discussion about the PR should have been resolved before the merge so all participants came to the conclusion that the PR is ready to be merged. That resolved discussions are collapsed is a decision of GitHub. I have no problem that they are hidden by default but if it annoys you I can disable this setting for ZODB and ZEO. (Hopefully remembering this discussion in the next months so I do not reactivate it again.)

What you you think?

navytux commented 2 years ago

@icemac thanks for feedback. In my view some discussions archives are useful to be visible in the long run - for example they sometimes contain useful information about why a particular decision has been made.

Ideally all such information should be present in either code itself and/or appropriate commit/merge messages. But we are very far from that ideal, and practically I do not see it might be improved to the level where PR discussions do not have any value after a PR is merged. What we usually have is that a merge commit only references corresponding pull-request, and an interested reader (= e.g. developer trying to understand a context or a reason for particular change) goes to PR page on github to investigate it further.

Then, with displayed discussions, the information is immediately there. However with collapsed discussions, it is practically hard to find, or even notice, what might be important to see. Thus having discussions collapsed is inconvenient in my view.

I see it like this: having discussions collapsed helps in the short-term: it helps to immediately see what concerns remain unaddressed. However, as I tried to explain above, in the long-term it does not really help - quote the contrary. So, in my view, what to do should be decided based on whether short or long term is more important. In my view the long-term effect is significantly more important compared to its short-term counterpart. Thus I would suggest to have discussion not collapsed, at least for ZODB-stack repositories.

Please also note, that e.g. in current PR, it was not a problem for Dieter to actually approve it, after all concerns, that were raised, were either addressed with fixups, or commented back. So when @d-maurer approved this PR, all discussions were in "unresolved" state. It was only me to "resolve" them after the approval for the sole reason of github blocking merge in the "unresolved" state.

Please do not get me wrong: I do not insist on switching discussion back into unresolved state if people find it inconvenient. But if there is no strong preference, I believe, as I tried to explain above, that in the long-term, having discussions not collapsed is better.

Kirill

icemac commented 2 years ago

@navytux I disabled Require conversation resolution before merging here and in ZEO.

navytux commented 2 years ago

Thanks, @icemac.

zopefoundation / ZODB

Modernize race tests and add test that catches ZEO corruption on invalidation / disconnect #368

modify changes objects in the database by executing "next" step.

Spec invariant should be preserved.

modify changes objects in the database ...