pmwkaa / sophia

Modern transactional key-value/row storage library.
http://sophia.systems
Other
1.85k stars 153 forks source link

Serialized Snapshot Isolation (SSI) anomalies #164

Closed BrunoBonacci closed 6 years ago

BrunoBonacci commented 6 years ago

In sophia homepage, one of the indicated features is: Serialized Snapshot Isolation (SSI) consistency level.

In some test have experienced some anomalies which shouldn't be allowed under SSI.

The definition of SSI from https://wiki.postgresql.org/wiki/SSI is:

With true serializable transactions, if you can show that your transaction will do the right thing if there are no concurrent transactions, it will do the right thing in any mix of serializable transactions or be rolled back with an error.

[...] problems which can occur with certain combinations of transactions at the REPEATABLE READ transaction isolation level, [...] are avoided at the SERIALIZABLE transaction isolation level.

Wikipedia says about Snapshot isolation

A transaction executing under snapshot isolation appears to operate on a personal snapshot of the database, taken at the start of the transaction. When the transaction concludes, it will successfully commit only if the values updated by the transaction have not been changed externally since the snapshot was taken.

It implies that in a SSI transaction not only you can Read Your Own Writes, but reads should be Repeatable.

This is also reiterated here: Serializable Isolation for Snapshot Databases paper.

Now it appear that this guarantee is violated in the following test:

Assuming we have a db with two keys A and B both of value 5. The invariant is: The sum of A + B should be always 10. Nobody should be able to observe a invalid state.

Time Tx1 Tx2 Note
Initial state
A=5, B=5
------ ------------- --------------- --------------------
t0 BEGIN TX
t1 READ(A)=5
t2 BEGIN TX
t3 READ(A)=5
READ(B)=5
t4 WRITE(A)=6
WRITE(B)=4
t5 COMMIT = OK
t6 READ(B)=4 INVALID! A+B = 9

However, it is say, that any attempt to write and commit on Tx1 will fail, however a read-only transaction can still observe an invalid state (and potentially side effecting the wrong info).

Doing further test I've concluded that the this is due to the fact that READ REPEATABLE is not provided. For example:

Time Tx1 Tx2 Note
Initial state
A=5, B=5
------ ------------- --------------- --------------------
t1 READ(A)=5
t2 BEGIN TX
t3 READ(A)=5
WRITE(A)=6
t5 COMMIT = OK
t6 READ(A)=6 INVALID! A changed

From this test we can observe that reads are NOT repeatable. This seams to happen for UPDATES and DELETES but not for INSERTS.

In a Snapshot isolation level Repeatable reads must be guaranteed. This is very important to guarantee that side effecting transaction will see a consistent view of the world.

As result of this test it seems that the ACTUAL consistency level provided by Sophia (v2.2) is READ COMMITTED with Read-Your-Own-Writes.

pmwkaa commented 6 years ago

Do you have an example in Sophia showing that behaviour?

t6 READ(B)=4 -- This simply should not happen, because Tx1 should only see versions starting from t0 and nothing else (because of snapshot isolation). Difference with SSI and SI is that SSI treats any read operations as write.

Have you tried Hermitage tests: https://github.com/pmwkaa/hermitage. They included as part of test suite.

BrunoBonacci commented 6 years ago

I've tweaked your transaction.c example. Here is what i've got:

#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <assert.h>

#include <sophia.h>

int main(int argc, char *argv[])
{
        (void)argc;
        (void)argv;

        /* open or create environment and database */
        printf("creating db\n");
        void *env = sp_env();
        sp_setstring(env, "sophia.path", "_test", 0);
        sp_setstring(env, "db", "test", 0);
        int rc = sp_open(env);
        if (rc == -1)
                goto error;

        void *db = sp_getobject(env, "db.test");

        /* set up */
        printf("setting initial value: key1 = value1\n");
        void *o = sp_document(db);
        sp_setstring(o, "key", "key1", 0);
        sp_setstring(o, "value", "value1", 0);
        rc = sp_set(db, o);
        if (rc == -1)
                goto error;

        /* begin transaction TX1*/
        printf("begin transaction TX1\n");
        void *tx1 = sp_begin(env);

        /* READ KEY1 */
        o = sp_document(db);
        sp_setstring(o, "key", "key1", 0);
        o = sp_get(tx1, o);
        if (o) {
            int size;
            char *ptr = sp_getstring(o, "value", &size);
            printf("reading value (TX1): key1 = %s\n", ptr);

            sp_destroy(o);
        }

        /* ----------------------------- NEW TX2 ----------------------------- */
        printf("\tbegin transaction TX2\n");
        void *tx2 = sp_begin(env);

        printf("\tsetting new value (TX2): key1 = value2\n");
        o = sp_document(db);
        sp_setstring(o, "key", "key1", 0);
        sp_setstring(o, "value", "value2", 0);
        rc = sp_set(tx2, o);
        if (rc == -1)
            goto error;

        printf("\tcommit transaction TX2\n");
        rc = sp_commit(tx2);
        if (rc == -1)
            goto error;
        /* ---------------------------- END OF TX2 --------------------------- */

        /* get key1 again tx1*/
        o = sp_document(db);
        sp_setstring(o, "key", "key1", 0);
        o = sp_get(tx1, o);
        if (o) {
            int size;
            char *ptr = sp_getstring(o, "value", &size);
            printf("reading value (TX1): key1 = %s\n", ptr);

            sp_destroy(o);
        }

        /* commit transaction */
        rc = sp_commit(tx1);
        if (rc == -1)
                goto error;

        /* finish work */
        sp_destroy(env);
        return 0;

error:;
        int size;
        char *error = sp_getstring(env, "sophia.error", &size);
        printf("error: %s\n", error);
        free(error);
        sp_destroy(env);
        return 1;
}

output:

creating db
setting initial value: key1 = value1
begin transaction TX1
reading value (TX1): key1 = value1
      begin transaction TX2
      setting new value (TX2): key1 = value2
      commit transaction TX2
reading value (TX1): key1 = value2
pmwkaa commented 6 years ago

I've written test by using operations order that you mentioned in the first message: https://github.com/pmwkaa/sophia/commit/48beea58d54729f8037c7fbc249e058a974d8872

pmwkaa commented 6 years ago

Thank you for the example, it helped to find the bug. For some bizarre reason I have not properly update pointer to correct document version during in-memory search. I wonder why tests not reveal it, because it seems quite obvious.

I have added this example to the test suite also: https://github.com/pmwkaa/sophia/commit/669d57ba91bcb3593d8528704b2d790a97ffadd6

BrunoBonacci commented 6 years ago

Great, I confirm I see the right behaviour now. thx Are you planning to cut a release?