tarantool / tarantool

Get your data in RAM. Get compute close to data. Enjoy the performance.
https://www.tarantool.io

vinyl: vy_tx_write doesn't handle memory errors #1842

Closed rtsisyk closed 7 years ago

rtsisyk commented 7 years ago
static struct txv *
vy_tx_write(write_set_t *write_set, struct txv *v, uint64_t time,
         enum vinyl_status status, int64_t lsn)
         [CUT]

        for (; v && v->index == index; v = write_set_next(write_set, v)) {
                [CUT]

                /*
                 * Insert the new statement to the range's
                 * in-memory index.
                 */
                struct vy_mem *mem = range->mem;
                int rc = vy_mem_tree_insert(&mem->tree, stmt, NULL); /* <-- OOM */
                assert(rc == 0); /* TODO: handle BPS tree errors properly */
                (void) rc;
                vy_stmt_ref(stmt);

vy_tx_write() is executed AFTER wal_write(), so it is not possible to roll back the transaction if the insert fails. On the other hand, vy_mem_tree is indexed by the (key, lsn) pair and the LSN is assigned by the WAL writer, so there is no way to update vy_mem_tree BEFORE wal_write().
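
To make the ordering problem concrete, here is a minimal sketch with toy stand-ins. None of the names below (toy_mem, toy_wal, TOY_LSN_INFINITY and the two commit functions) are Tarantool APIs; they are assumptions for illustration only. The first function logs and then inserts, so a failed insert leaves a logged record behind. The second inserts under a placeholder LSN before logging, which is the reordering discussed in this issue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical toy stand-ins, not Tarantool code. */
enum { MEM_CAPACITY = 2 };
#define TOY_LSN_INFINITY INT64_MAX

struct toy_stmt { int key; int64_t lsn; };
struct toy_mem { struct toy_stmt slots[MEM_CAPACITY]; int used; };
struct toy_wal { int64_t last_lsn; };

/* Fallible insert: fails when the toy index is full (models OOM).
 * Returns the slot index on success, -1 on failure. */
static int
toy_mem_insert(struct toy_mem *mem, int key, int64_t lsn)
{
	assert(mem != NULL);
	if (mem->used == MEM_CAPACITY)
		return -1;
	mem->slots[mem->used].key = key;
	mem->slots[mem->used].lsn = lsn;
	return mem->used++;
}

/* Irreversible step: once a record is logged it cannot be undone. */
static int64_t
toy_wal_write(struct toy_wal *wal)
{
	return ++wal->last_lsn;
}

/* Log first, insert second: a failed insert happens after the
 * point of no return, so the error cannot be handled by rollback. */
static int
commit_insert_after_wal(struct toy_wal *wal, struct toy_mem *mem, int key)
{
	int64_t lsn = toy_wal_write(wal);
	return toy_mem_insert(mem, key, lsn) < 0 ? -1 : 0;
}

/* Insert first under a placeholder LSN, log second: a failed insert
 * is reported before anything is logged. The toy array lets us patch
 * the LSN in place; a tree keyed by (key, lsn) would need more care. */
static int
commit_insert_before_wal(struct toy_wal *wal, struct toy_mem *mem, int key)
{
	int slot = toy_mem_insert(mem, key, TOY_LSN_INFINITY);
	if (slot < 0)
		return -1; /* nothing was logged, rollback is still possible */
	mem->slots[slot].lsn = toy_wal_write(wal);
	return 0;
}
```

With a full toy_mem, the first variant fails after advancing the log, while the second fails with the log untouched.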

We agreed to move vy_tx_write() to vy_prepare() and call it before wal_write(). We will use LSN_INFINITY when putting the tuple into mem. Collateral problems:

rtsisyk commented 7 years ago

Blocked by #1908 and #1572 as @kostja said.

rtsisyk commented 7 years ago

Discussed today. Moving to 1.7.5.

alyapunov commented 7 years ago

We also need to support read_views between two prepared but not yet committed transactions. That ruins the idea of using the same LSN_INFINITY for all tuples.
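
A small sketch of why a single shared value is a problem, assuming a statement is visible to a read view when its LSN is not greater than the view's vlsn. The helper names and the "base plus prepare sequence number" scheme are hypothetical, chosen only to show that distinct provisional values let a read view sit between two prepared transactions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical placeholder base; leaves headroom above it. */
#define TOY_LSN_INFINITY (INT64_MAX / 2)

/* Assumed visibility rule: a statement is seen by a read view
 * if its LSN does not exceed the view's vlsn. */
static bool
toy_visible(int64_t stmt_lsn, int64_t vlsn)
{
	return stmt_lsn <= vlsn;
}

/* True if some vlsn sees statement a but not statement b.
 * Under the rule above, vlsn = lsn_a works exactly when
 * lsn_a < lsn_b. */
static bool
toy_can_separate(int64_t lsn_a, int64_t lsn_b)
{
	return lsn_a < lsn_b;
}

/* Hypothetical scheme: give each prepared transaction its own
 * provisional LSN derived from its prepare sequence number. */
static int64_t
toy_provisional_lsn(int64_t prepare_sn)
{
	assert(prepare_sn >= 0);
	assert(prepare_sn <= INT64_MAX - TOY_LSN_INFINITY);
	return TOY_LSN_INFINITY + prepare_sn;
}
```

If both prepared statements carry TOY_LSN_INFINITY, toy_can_separate() is false for them; with per-transaction provisional values it is true for the earlier one against the later one.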

alyapunov commented 7 years ago

The steps are:

  1. Apply locker's patch 'freeze mems' or however it is called.
  2. Make it possible to delete records from mems (the main trouble is mem_iterator).
  3. Rename the tsn member of vy_tx and vy_xm to tx_id.
  4. Add a prepare_sn member to vy_tx; assign it during prepare to the next incremented value.
  5. Create a linked list of prepared transactions ordered by prepare time (i.e. by prepare_sn).
  6. Enable collecting of a read set for conflicted txs.
     6.1. Create a list of directly dependent transactions in each transaction. The list must contain the conflicted transactions that can see this transaction but cannot see the next (later) prepared transaction.
     6.2. On conflict (during prepare), in addition to setting the vlsn of the conflicted tx to lsn - 1, try to find the previous prepared tx and, if it exists, add the conflicted tx to the directly-dependent list of the found tx.
     6.3. On commit of a transaction, traverse its directly-dependent list and set all vlsns to the lsn that was provided by the WAL.
     6.4. On abort of a prepared transaction, abort all its directly dependent txs. It is also possible to rebind a dependent tx to the previous prepared tx if there is no intersection between the read set of the dependent tx and the write set of the aborting tx.
  7. Store the replace_stmt and delete_stmt pointers (the replace/delete variables in vy_commit) in txv. Implement rollback of a prepared tx in the same order as vy_commit works now.
  8. Think about deconflicting some txs on abort of the cause of the conflict.
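
Steps 4, 5, 6.3 and the first part of 6.4 can be sketched roughly as follows. This is a toy model, not the vinyl code: struct toy_tx, struct toy_xm and every function name are hypothetical, and conflict detection, read sets and the rebinding from the second part of 6.4 are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy transaction, not the real struct vy_tx. */
struct toy_tx {
	int64_t prepare_sn;            /* 0 until prepared (step 4) */
	int64_t vlsn;                  /* read view LSN */
	int aborted;
	struct toy_tx *next_prepared;  /* link in the prepared list (step 5) */
	struct toy_tx *dependents;     /* head of directly-dependent list */
	struct toy_tx *next_dependent; /* link in the owner's dependent list */
};

/* Hypothetical toy transaction manager. */
struct toy_xm {
	int64_t last_prepare_sn;
	struct toy_tx *prepared_head;
	struct toy_tx *prepared_tail;
};

/* Steps 4 and 5: assign the next prepare_sn and append to the tail,
 * which keeps the list ordered because prepare_sn only grows. */
static void
toy_prepare(struct toy_xm *xm, struct toy_tx *tx)
{
	assert(tx->prepare_sn == 0);
	tx->prepare_sn = ++xm->last_prepare_sn;
	tx->next_prepared = NULL;
	if (xm->prepared_tail != NULL)
		xm->prepared_tail->next_prepared = tx;
	else
		xm->prepared_head = tx;
	xm->prepared_tail = tx;
}

/* Step 6.2, second half: remember that reader depends on prepared. */
static void
toy_add_dependent(struct toy_tx *prepared, struct toy_tx *reader)
{
	reader->next_dependent = prepared->dependents;
	prepared->dependents = reader;
}

/* Step 6.3: on commit, give every dependent the LSN from the WAL. */
static void
toy_commit(struct toy_tx *tx, int64_t lsn)
{
	for (struct toy_tx *d = tx->dependents; d != NULL; d = d->next_dependent)
		d->vlsn = lsn;
}

/* Step 6.4, first part: on abort, abort all direct dependents. */
static void
toy_abort(struct toy_tx *tx)
{
	tx->aborted = 1;
	for (struct toy_tx *d = tx->dependents; d != NULL; d = d->next_dependent)
		d->aborted = 1;
}
```

Unlinking a committed or aborted transaction from the prepared list is omitted to keep the sketch short.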

alyapunov commented 7 years ago

2nd part of 6.4 must go to another ticket.