Closed adamreichold closed 6 months ago
Honestly, now I'm curious why I did it this way in the past. I suspect it was so that if you have two tasks trying to take a write, then they queue correctly and can await on the call to .write. What I want to avoid is the tasks repeatedly spinning on the .lock() until they know they "probably" will get the lock.
I'm wondering if we could achieve something similar with some other construction while retaining the sync `Mutex`. Perhaps something like:

```rust
let guard = loop {
    if let Ok(guard) = self.active.try_lock() {
        break guard;
    } else {
        // either a yield/yield_now, or some kind of wait on a queue or something?
    }
};
```
We could probably use https://docs.rs/tokio/latest/tokio/sync/struct.Notify.html to achieve this. The remaining problem is what happens if we drop the write transaction without a commit? We need to ensure that we call notify on drop so that waiting tasks wake up.
I think the other possible issue is that we would then have a mutex guard held in the write txn across a possible await point, which clippy can warn about.
> Honestly, now I'm curious why I did it this way in the past. I suspect it was so that if you have two tasks trying to take a write, then they queue correctly and can await on the call to .write. What I want to avoid is the tasks repeatedly spinning on the .lock() until they know they "probably" will get the lock.
I think they will queue on the write `Mutex` which is still a fair/FIFO async mutex even after this change? I only modified the read side which should never have to wait for the writers to make progress.
And indeed the `active` lock which became sync here is only held for short/limited/internal critical sections, once in `read` but only to clone an `Arc` and once in `commit` to publish the new generation. But AFAIU, the part in `commit` will only ever fight against readers as only the single writer already having acquired `write` will be committing at any point in time.
> Honestly, now I'm curious why I did it this way in the past. I suspect it was so that if you have two tasks trying to take a write, then they queue correctly and can await on the call to .write. What I want to avoid is the tasks repeatedly spinning on the .lock() until they know they "probably" will get the lock.
> I think they will queue on the write `Mutex` which is still a fair/FIFO async mutex even after this change? I only modified the read side which should never have to wait for the writers to make progress.
Ahh true. I made the mistake of assuming you were changing the write side too.
> And indeed the `active` lock which became sync here is only held for short/limited/internal critical sections, once in `read` but only to clone an `Arc` and once in `commit` to publish the new generation. But AFAIU, the part in `commit` will only ever fight against readers as only the single writer already having acquired `write` will be committing at any point in time.
Yep, and considering that in those cases the whole process doesn't block or await, we should be good. Let me check the concerns in the test case and we can go from there.
Given this does change the behaviour to remove the async I think we will need a major version bump of the lib before we release it.
> Given this does change the behaviour to remove the async I think we will need a major version bump of the lib before we release it.
Indeed, this is a semver-breaking change. One could play games with keeping `read` async (without actually awaiting anything) and adding a `read_sync` variant for the actual implementation, but considering that the downstream changes to adjust to this should be as trivial as removing `.await`, I do not think this is worth it and would also just go for the breaking release, especially since `concread` is still a pre-1.0 crate.
Also, while testing this in our code base using a Git dependency, I found one paper cut; the fix, #112, ensures that migration is painless. (For anyone like me already using Serde support in 0.4.5 and 0.4.6.)
@adamreichold Okay, I'll upgrade to the git release in some other projects that heavily rely on this crate to check for any other issues before we go ahead and do a major release.
@adamreichold Okay, tested with no issues. So I'm happy to go ahead and do a major release provided there aren't any other major API changes you want to make :)
I also hope to get our downstream usage reviewed and merged using a Git dependency tomorrow and would deploy it to our staging environment for a few days.
So I do not have any additional changes planned but I could also offer some production-like testing to increase our confidence in shipping a 0.5.0 release next week? But if you already are sufficiently confident, I certainly won't mind using a crates.io dependency either.
Next week sounds great to me. I'm already very confident having reviewed the code, and the test cases are very exhaustive, so if they pass we're already in a good spot. I'll wait for your production-like tests to do 0.5.0 next week then.
Just wanted to say that the new code has been running since the middle of last week without issue for us. I will keep it running until 0.5.0 is released and speak up if anything unexpected happens, but for now everything looks good to me.
Thank you! Release made!
Using MVCC, readers never have to wait for writers, and hence opening a read transaction never blocks. We should therefore be able to make `LinCowCell::read` sync by using a sync `Mutex` for its `active` field. This then implies that `LinCowCellWriteTxn::commit` can also become sync and only the potentially blocking `LinCowCell::write` has to stay async so that one writer can yield to others.

This does imply a change to the `test_concurrent_create` test case, as the readers are now fully blocking code that never yields to the executor, and the first one will hence starve all other readers and the writers so that the test case never finishes. This is mitigated by using `spawn_blocking` for these actually blocking tasks. Alternatively, they could be forced to yield using `yield_now` or spawned onto a multi-threaded runtime with at least one more worker thread than there are reader tasks.