kanidm / concread

Concurrently Readable Data Structures for Rust
Mozilla Public License 2.0
339 stars 15 forks source link

Why are write transactions abort by default? #87

Closed axos88 closed 1 year ago

axos88 commented 1 year ago

Seems very counter intuitive and in contrast to how traditional concurrency data structures work. It is also seems rare to want to abort a txn without committing it, it usually happens due to an error condition, and not part of normal code flow.

I'd find it much more intuitive if the write txn would commit on drop, and have a method for aborting.

Firstyear commented 1 year ago

In good systems design, you the developer should be explicitly aware of transaction lifecycles and their presence, including precisely when they are committed. Commit by default leads to developers "forgetting" and accidentally committing when they forgot to abort. Where as abort by default will never lead to accidentally committing invalid or partial states without the developers consent and awareness. I have seen "developer laziness" around transactions cause more than enough damage in my lifetime.

Within Rust, drop is not guaranteed to run. This is a pretty hairy and complex topic, but within the context of these structures, not-dropping only causes a memory leak but your application will continue correctly. Contrast, if drop didn't run and you "assumed" a commit would occur, then you would effectively deadlock due to the pin on the write context that could now no longer be freed.

Commit as an explicit action lends it self far better to the memory model and design within rust with how ownership works. Commit takes (mut self) where drop takes &mut self. Since we have to move out of the guards that exist in the transactions, doing this in drop is harder since all the values of the transaction are "bound together" where in commit we can destructure them and precisely order the commit phase.

Finally, just because something is "convenient" does not make it correct. Nor does "this is how it's always been done" serve as a valid justification to continue bad behaviours. My aim is to make datastructures that you can't hold wrong, and I strongly believe the current "ask for commit, abort by default" is correct on many levels.

axos88 commented 1 year ago

Not sure what prompted the heated response from your part, but I'll try to avoid doing the same.

In an ideal world I would agree with you. In that world we would have compile time support for linear types in rust that cannot be simply dropped, and it would be possible to require the user to explicitly commit or abort a transaction.

Unfortunately we are very far from that, and it seems linear types are not even being actively considered due to the insane amount of shitstorm - pardon the expression - it would result in if implemented.

I'm curious on what circumstances you mean by drop not being guaranteed to run that are not results of undefined behavior, as far as I know rust's main selling point is exactly about guaranteeing that drop is run, that it is run only once, and also the ability to figure out exactly WHEN it is being run. The only circumstances I can think of when drop may not run are not important from an stm's perspective: aborts or mem:forgetting the transaction guard.

The former ends the program anyways, the later is a developer error.

In the real world, where there is no compiler support for linear types, they can be simulated during run time by panicking in the implementation of Drop but that is an eh solution. In this world I don't think that either commit by default nor abort by default are good assumptions, with the latter being more problematic - hence the issue I raised.

During the normal flow of a program optimistically we expect all transactions to be committed. We check for erroneous situations and we act on them if necessary. These checks are rarely formulated as "if everything okay do this", and much more often "did this bad thing happen? If so do this, otherwise continue on". So aborting the transaction has a natural place in the conditionals, while the commit not necessarily.

I understand your point about defensive programming and trying to avoid misuse of the library, but is introducing a behavior that is opposite of what usually happens in the stdlib the correct way to do this? If you really want to avoid having commit by default, i would at least panic on drop, because it is surprising behavior to automatically abort transactions without an actual erroneous situation.

Firstyear commented 1 year ago

I'm curious on what circumstances you mean by drop not being guaranteed to run that are not results of undefined behavior, as far as I know rust's main selling point is exactly about guaranteeing that drop is run, that it is run only once, and also the ability to figure out exactly WHEN it is being run. The only circumstances I can think of when drop may not run are not important from an stm's perspective: aborts or mem:forgetting the transaction guard.

And mem::forget is a real possibility. Another is panic!() since panic's can be caught and the program can continue. This is a default in tokio with async tasks.

I understand your point about defensive programming and trying to avoid misuse of the library, but is introducing a behavior that is opposite of what usually happens in the stdlib the correct way to do this? If you really want to avoid having commit by default, i would at least panic on drop, because it is surprising behavior to automatically abort transactions without an actual erroneous situation.

How is this different to the std::lib? There are no transactional datastructures or types in the rust stdlib. Even crates like im are atomic, not transactional.

The closest parallels here are databases. LMDB is abort on drop unless you call commit (https://docs.rs/lmdb/0.8.0/lmdb/trait.Transaction.html#method.commit). Even the abort call here is just just a drop() call which triggers that.

Sled has the semantics you want, commit by default with an abort on error, but it does this with closures that expect a result, and the abort is triggered on Err() variants.

Regardless, I'm not rewriting this whole library to use closure wrapped transactions, and enforcing "commit must be explicitly called" is correct, and I'm not changing my view on that.