Neptune-Crypto / neptune-core

anonymous peer-to-peer cash
Apache License 2.0
23 stars 7 forks source link

Evict interior mutability pr #91

Closed dan-da closed 6 months ago

dan-da commented 6 months ago

This PR follows on #89. It completes the refactoring by making all (hopefully) mutating methods take &mut self instead of &self. With this PR in place, I feel we will be in a better place with respect to locking and data-integrity, and reasoning about the code.

So no more calling a method fn do_xyz(&self) and wondering if it will mutate state or not. This makes it very obvious when state is being changed, including on the LevelDB Database directly and on all the twenty_first::storage types, such as DbtVec.

So for example, to modify monitored_utxos, now one must do:

global_state.wallet_state.wallet_db.monitored_utxos_mut().push(utxo);

where in the past one could do:

global_state.wallet_state.wallet_db.monitored_utxos().push(utxo);

As part of this, i moved a couple more things into GlobalState, that seem to fit better there. For example I could not even get WalletState::prune_abandoned_monitored_utxos() to compile (borrow checker complaints) until I moved it.

This depends on changes in twenty_first. Specifically https://github.com/Neptune-Crypto/twenty-first/commit/fac4ead7d0ad0f816064af1948d46689b2430bbf on branch https://github.com/Neptune-Crypto/twenty-first/tree/0_36_backports_mut_self_squashed


How to proceed?

Options would seem to be:

  1. abandon pr 89 and focus on reviewing/merging this PR which includes 89. -- Or --
  2. focus on this after 89 is merged.

I think #2 makes the most sense, because it will be much easier to see the diffs specific to this PR after 89 is merged.

So for this reason, I am just making this a draft PR for now.


Regarding data-integrity, I think it may be useful to distinguish between serialization and atomicity and also touch on transactional.

With the GlobalStateLock in place we should have total serialization of our state, meaning that only one thread/task can mutate global state at a time. So for example, when storing a new block we must update blockchain, mutator set, wallet_state, and mempool. If we acquired and released the lock for each one, we would have serialization of the individual updates, but the individual state objects could become out of sync. Instead, we hold the lock over the entire operation, so that no other thread can mutate until we are all done. Everyone must wait their turn. So we can call this entire sequence of state mutations to store a block a serialized transaction, or just transaction. Similar to a SQL multi-statement transaction, eg begin; insert into foo values(1,2,3); insert into bar values ('x', 'y', 'z'); commit;

So far so good, right?

But what about atomicity? Here is where we fall down.

In a good (ACID) database, each transaction will be fully applied, or not applied at all. The DB guarantees never to write a partial update to disk in a way that a read op could ever see it. But we can't make such a guarantee.

Let's say we are in the middle of storing a block. We have written out blockchain file(s) and the MutatorSet DB and are now updating Wallet DB, let's say. Boom. Power goes off. Well, the LevelDB writes are supposed to be atomic. So that should either be written to disk properly or not. So our blockchain files have the new block, MutatorSet has the new block, WalletDb does not, and Mempool does not. So upon restart, we are in an inconsistent state.

Even though our mutations are serialized, they are not atomic. Not all-or-nothing across all state stores. So that is an area for improvement, and a sizeable task. I don't think we need such robustness for an MVP, but it's a good long term goal to keep in mind. (I don't know if even bitcoin-core has full atomicity of block updates in 2024, for example. interesting question).

An area for near-term review and possible improvement is in the area of updates that should be more transactional. We may be doing some things in smaller serialized mutations that really should be done as a single logical transaction. I'm pretty sure that's the case, but I haven't really focused on it yet, so I can't cite a specific example here and now.

So that's a bit of a brain dump. Probably just stating the obvious. ;-)

dan-da commented 6 months ago

rebased to master.

Sword-Smith commented 6 months ago

Looks good to me. (At least) one question, though: This relies on an updated version of twenty-first. We prefer to have release versions of neptune-core depend on crates from crates.io, right?

dan-da commented 6 months ago

Looks good to me. (At least) one question, though: This relies on an updated version of twenty-first. We prefer to have release versions of neptune-core depend on crates from crates.io, right?

yes, that's why I've been raising the issue that triton-vm and tasm-lib and by extension neptune-core do not build against the current twenty-first master. iirc, the previous PR also relied on a (different) twenty-first branch commit.

As I see it:

So yeah, that's what makes sense to me. If the team has been working another way, please let me know.

Sword-Smith commented 6 months ago

As I see it:

* release versions should only rely on crates.io published packages.

* in between releases, a patch section in Cargo.toml can be used to specify specific github revisions of crates that are  undergoing simultaneous development, eg tasm-lib, twenty-first, etc.

* ideally, if someone makes a breaking change in a library such as twenty-first, they will also coordinate with the dependent crates asap to ensure they continue building.

* when publishing to crates.io, the lowest lib gets published first eg twenty-first, and then update next higher (eg triton-vm) Cargo.toml to use published version, and so on.

So yeah, that's what makes sense to me. If the team has been working another way, please let me know.

This sounds good to me :+1: