Closed aboodman closed 3 years ago
we are not serializing all transactions
we had a test that verified sequential execution for write transactions, which i improved and it still passes. we did not have a test for parallel execution of read transactions, so i added one and it passes with no code changes. both of these are in #270. so it looks like we are good on this front.
there are however a few things i want to understand better before i close this issue out, including at least three levels of locking in connection.rs. will post what i find here when i pick it back up next week.
Before closing this issue out I wanted to rationalize the locking in connection. I read the code pretty closely and here's what I think I have learned. In the below by "operation" or "op" I mean an rpc called within a transaction, eg put, get, has, etc.
Desired properties:

A. A write transaction is exclusive of all other transactions.
B. Within a transaction, write ops are exclusive of all other ops.
C. Within a transaction, read ops can execute concurrently with each other.
In connection there are three levels of locking in play. The transaction map looks like this:
```rust
enum Transaction<'a> {
    Read(db::OwnedRead<'a>),
    Write(db::Write<'a>),
}

type TxnMap<'a> = RwLock<HashMap<u32, RwLock<Transaction<'a>>>>;
```
Let's call the outer lock around the hashmap the "map lock". Then we have "transaction locks" around the individual Transactions in the map. And a "db lock" sits inside each Transaction.
The things I wanted to understand are whether all these locks are necessary (i think: yes), which of the desired properties listed above each provides (see below), and whether there are any unintended or less-than-ideal consequences of how they work (maybe).
Simplifying the code so we can see interplay of locking, here's how it works:
```rust
let map: RwLock<HashMap<u32, RwLock<Transaction>>>;

fn open_transaction(r: Request) {
    let tx = if is_read_tx(&r) {
        // db lock: shared read lock on the underlying store
        let db_read = store.read();
        Transaction::Read(db_read)
    } else {
        // db lock: exclusive write lock on the underlying store
        let db_write = store.write();
        Transaction::Write(db_write)
    };
    let mut map_write = map.write();
    map_write.insert(tx_id, RwLock::new(tx));
}
```
The db lock used in open_transaction gives us desired property A (a write tx is exclusive of all others). We have discussed this before, but given that these locks are not fair, a write could wait arbitrarily long to start as long as read txs keep coming in. That is, if a read tx is open when we try to open a write tx, the write waits. But more reads can continue to be opened; the write just keeps waiting until nothing is open. Call this potential issue 1. Further, if two write txs come in while a read is open, either one might start first when the reads are finished. Call this potential issue 2.
Read and write ops work like this:
```rust
fn read_op(r: Request) {
    let map_read = map.read();
    let tx = map_read.get(r.tx_id);
    let tx_read = tx.read();
    do_work(tx_read);
    // note: map_read and tx_read are held through do_work and dropped here
}

fn write_op(r: Request) {
    let map_read = map.read();
    let tx = map_read.get(r.tx_id);
    let tx_write = tx.write();
    do_work(tx_write);
    // note: map_read and tx_write are held through do_work and dropped here
}
```
In the above the transaction locks (tx_read, tx_write) give us desired properties B and C (within a transaction write ops are exclusive, read ops can execute concurrently).
The map lock hasn't so far come into play with our desired properties; it is just there to make concurrent access to the transaction map safe. There is an unfortunate interplay though: notice in read_op and write_op that the map read lock is held through doing the work. Suppose there are open read txs. Then a new read transaction cannot be opened until there are exactly zero read ops running. This is because open_transaction needs a write lock on the map, but read_op holds a read lock until it returns, so lots of overlapping read ops prevent a new read transaction from opening even though, strictly speaking, it should be able to. Call this potential issue 3.
Closing transactions works like this:
```rust
fn do_commit(r: Request) {
    let mut map_write = map.write();
    let tx = map_write.remove(r.tx_id);
    // Note: transaction lock not held
    tx.commit();
}

fn do_close(r: Request) {
    let mut map_write = map.write();
    let tx = map_write.remove(r.tx_id);
    // Note: transaction lock not held
}
```
Note that we do not hold the transaction lock when we drop the transaction at the end of each function. In theory there could be outstanding read or write ops, and we would be destroying the transaction they are executing within before they are finished. In practice there are not: do_close and do_commit acquire the map write lock before proceeding, and read_op and write_op hold a map read lock through their entire execution, which prevents do_close/do_commit from starting until they are done. Nothing can be borrowed from the map when we get the map write lock. However, if we changed the lifetime of the map read lock in read_op/write_op to be more short-lived, say to address issue 3 above, then this could lead to problems. It seems like, to be safe, we should acquire a transaction write lock here to ensure all other ops are complete. Call this potential issue 4.
Ok, what now. I think with respect to the things learned above:

Issue 3 exists because map_read.get() returns a reference into the map (a &RwLock), preventing the map read guard from being dropped before do_work runs. It can be fixed by slapping an Arc around the transaction lock so the map becomes a RwLock<HashMap<u32, Arc<RwLock<Transaction<'a>>>>>. This enables us to work around the fact that map_read.get() borrows from the map: we get a &Arc<RwLock>, clone it, and drop the original reference.

Anyone think I am reading this wrong, or think we should do something other than the above?
Additional thoughts:
The logic has been moved to JS and all these issues are resolved as far as I can tell:
https://github.com/rocicorp/repc/pull/203/files/0c7353a5f4f98d30049afd3af5d4c335f14e03f2..c27b20550acf043e9b1dd14ebc3754613eba0f8f#diff-538fa79214eb88d59c2af080b7f78d197b0c40f287f2b4b0f0114e4cbb91b9f7