eXist-db / exist

eXist Native XML Database and Application Platform
https://exist-db.org
GNU Lesser General Public License v2.1

Journal Corruptions possible during concurrent transactions #2274

Open adamretter opened 6 years ago

adamretter commented 6 years ago

Even after all the recent work, it is still possible for corruption to occur in the Journal from concurrent transactions.

This currently affects all versions of eXist-db, and the same tests are present in each version branch.

This is a placeholder to remind us of:

  1. https://github.com/eXist-db/exist/blob/develop/exist-core/src/test/java/org/exist/storage/AbstractRecoverTest.java#L454
  2. https://github.com/eXist-db/exist/blob/develop/exist-core/src/test/java/org/exist/storage/AbstractRecoverTest.java#L516
  3. https://github.com/eXist-db/exist/blob/develop/exist-core/src/test/java/org/exist/storage/journal/AbstractJournalTest.java#L768

We need to write concurrent transaction tests to demonstrate this problem, but that is not at all easy to do. I spent some weeks experimenting with JCStress. Unfortunately, JCStress is not designed for heavy (i.e. slow) test operations such as database operations. I am not sure how to write such tests.

Regardless, the solution, I believe, is to keep WRITE locks on Documents and Collections for the duration of a transaction. This will likely have a performance impact on eXist-db, but correctness is probably more important than performant data loss ;-)
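
To illustrate the idea only, here is a minimal sketch of transaction-scoped write locks using plain `java.util.concurrent` classes. It does not use eXist-db's actual locking or transaction API; `TxnScopedLocks`, `lockForWrite`, `otherThreadCanWrite`, and `demo` are hypothetical names invented for this example. Write locks are collected as they are taken and released only when the transaction closes, so another thread cannot write to the same resource until then.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a transaction that keeps its WRITE locks until it ends.
final class TxnScopedLocks implements AutoCloseable {
    private final Deque<Lock> held = new ArrayDeque<>();

    // Acquire the write lock and remember it; it is NOT released after the single operation.
    void lockForWrite(ReentrantReadWriteLock resourceLock) {
        Lock writeLock = resourceLock.writeLock();
        writeLock.lock();
        held.push(writeLock);
    }

    // Commit/abort point: release every lock in reverse order of acquisition.
    @Override
    public void close() {
        while (!held.isEmpty()) {
            held.pop().unlock();
        }
    }

    // Probe from a second thread: can it obtain the write lock right now?
    static boolean otherThreadCanWrite(ReentrantReadWriteLock lock) {
        final boolean[] acquired = new boolean[1];
        Thread probe = new Thread(() -> {
            if (lock.writeLock().tryLock()) {
                acquired[0] = true;
                lock.writeLock().unlock();
            }
        });
        probe.start();
        try {
            probe.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return acquired[0];
    }

    // Returns { writable during the transaction, writable after it }.
    static boolean[] demo() {
        ReentrantReadWriteLock documentLock = new ReentrantReadWriteLock(); // stand-in for a Document lock
        boolean during;
        try (TxnScopedLocks txn = new TxnScopedLocks()) {
            txn.lockForWrite(documentLock);
            // ... document changes and journal writes would happen here ...
            during = otherThreadCanWrite(documentLock);
        }
        boolean after = otherThreadCanWrite(documentLock);
        return new boolean[] { during, after };
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("during=" + result[0] + ", after=" + result[1]);
    }
}
```

The cost mentioned above is visible here: the second thread is shut out for the whole transaction, not just for the individual write.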

alebastrov commented 5 years ago

If you are able to locate where the error occurs, you can use a Semaphore or CountDownLatch to force the steps to run in the right order.
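
A minimal sketch of that approach, assuming the problematic interleaving is already known: two `CountDownLatch` instances force thread B's write to land between thread A's begin and commit. The `journal` list is only a stand-in for the real Journal, and `OrderedInterleaving`, `run`, and the entry strings are hypothetical names, not eXist-db code.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class OrderedInterleaving {

    // Runs two stand-in "transactions" and returns the order in which their steps were recorded.
    static List<String> run() {
        final List<String> journal = new CopyOnWriteArrayList<>(); // stand-in for the Journal
        final CountDownLatch aStarted = new CountDownLatch(1);
        final CountDownLatch bWritten = new CountDownLatch(1);

        Thread txnA = new Thread(() -> {
            journal.add("A:begin");
            aStarted.countDown();   // step 1 done, release B
            await(bWritten);        // hold A until B has written
            journal.add("A:commit");
        });
        Thread txnB = new Thread(() -> {
            await(aStarted);        // B must not write before A has begun
            journal.add("B:write");
            bWritten.countDown();   // step 2 done, release A
        });

        txnA.start();
        txnB.start();
        join(txnA);
        join(txnB);
        return journal;
    }

    // Wait with a timeout so a broken ordering fails the test instead of hanging it.
    private static void await(CountDownLatch latch) {
        try {
            if (!latch.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for the other thread");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static void join(Thread thread) {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [A:begin, B:write, A:commit]
    }
}
```

This only helps once the suspect interleaving has been identified; unlike JCStress, it does not explore orderings on its own.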