cinchapi / concourse

Distributed database warehouse for transactions, search and analytics across time.
http://concoursedb.com
Apache License 2.0

Server crash can cause an irrecoverable error where Revisions are not properly offset #442

Closed jtnelson closed 3 years ago

jtnelson commented 3 years ago

This is related to #441

Consider a scenario where the Buffer contains

ADD X at t1
REMOVE X at t2
ADD Y at t3
REMOVE Y at t4
ADD Y at t5

Assume both cpb and csb are written, but the server crashes while ctb is being written, so the Buffer Page is not deleted.

When the server restarts, the Database will actually look like

ADD X at t1
REMOVE X at t2
ADD Y at t3
REMOVE Y at t4
ADD Y at t5

because the written cpb will be loaded and considered.

So, when the BufferTransportThread restarts, it will replay the write at t1, which will be accepted because the Database does not currently consider X to be present (X was added at t1 and removed at t2). Afterwards, all the remaining writes in the Buffer will be replayed and the Database will look like

ADD X at t1
REMOVE X at t2
ADD Y at t3
REMOVE Y at t4
ADD Y at t5 *
ADD X at t1
REMOVE X at t2
ADD Y at t3 *
REMOVE Y at t4
ADD Y at t5

creating an issue where ADD Y appears twice in a row for the same value (the starred revisions), so the first ADD Y is not properly offset by an intervening REMOVE Y.
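
To make the failure mode concrete, here is a minimal sketch of the scenario in Python. It is a toy model, not Concourse code: the `replay` and `find_unoffset` helpers are hypothetical names, revisions are modeled as plain `(action, value, timestamp)` tuples, and the replay is assumed to append every buffered write without per-write verification, which matches the final listing above.

```python
# Toy model of the crash scenario. All names here are hypothetical and
# are not part of the Concourse codebase.

# The Buffer contents from the scenario, as (action, value, timestamp).
BUFFER = [
    ("ADD", "X", 1),
    ("REMOVE", "X", 2),
    ("ADD", "Y", 3),
    ("REMOVE", "Y", 4),
    ("ADD", "Y", 5),
]


def replay(database, buffer):
    """Append every buffered write to the database (assumed naive transport)."""
    return list(database) + list(buffer)


def find_unoffset(revisions):
    """Return pairs of consecutive revisions for the same value that share
    the same action, i.e. an ADD followed by an ADD (or REMOVE by REMOVE)
    with nothing in between to offset it."""
    last_seen = {}
    violations = []
    for revision in revisions:
        action, value, _timestamp = revision
        previous = last_seen.get(value)
        if previous is not None and previous[0] == action:
            violations.append((previous, revision))
        last_seen[value] = revision
    return violations


# After the crash, the Database already holds all five writes, and the
# Buffer Page was never deleted, so the same five writes are replayed.
database_after_crash = list(BUFFER)
database_after_replay = replay(database_after_crash, BUFFER)

violations = find_unoffset(database_after_replay)
print(violations)
```

Under these assumptions, the only violation reported is the pair `ADD Y at t5` followed by `ADD Y at t3`, which corresponds to the two starred revisions. The revisions for X alternate correctly (ADD, REMOVE, ADD, REMOVE), which is why only Y ends up improperly offset.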