Optimize archival Mutator Sets, with a focus on IO

aszepieniec commented 7 months ago

Depends on #126.

Sword-Smith commented 6 months ago

@zkclay If you want to have a look at this, maybe have a look at this blogpost. If you understand the mechanics of Merkle Mountain Ranges (MMR) you are 80 % towards understanding mutator sets. A mutator set contains two lists:

an append-only commitment list which represents the outgoing UTXOs, and
an unbounded Bloom filter which represents spent UTXOs.

Both of these lists are represented by MMRs, so the logic to update a mutator set is 80 % MMR logic and 20 % other stuff.

If you don't quite understand the mechanics of MMRs, maybe this blogpost of mine can help.

zkclay commented 6 months ago

Alright, I'll take a look at this! Thanks for the resources!

zkclay commented 6 months ago

Hey @Sword-Smith

I spent some time reading up on the Merkle Mountain Ranges and the Mutator Set docs that you wrote and taking a look at the code. The blogs posts are dense and I had to reread it a few times, but now I have a much better understanding of the two concepts now :) Thanks for the work you and the team has put into them!

However, I’m still a bit unsure as to the optimization strategy we want to pursue here. In taking a look at the mutator set code, it seems like the MMR operations are well encapsulated in the Archival MMR abstraction, and the bulk of the code for the mutator set is concerned with managing the sliding window bloom filter. Are the optimizations we want to do here? I’m still a little hazy on the specifics of the SWBF logic and all the chunks we are manipulating, but I can look deeper into this, if it is the intended focus.

I also noticed that there seems to be some functionality duplicated across the Mutator Set and Mutator Set Accumulator, is this what we want to look into? Or are you thinking of something else entirely?

Sword-Smith commented 6 months ago

My motivation for this issue and #126 was to optimize the time that the node spends when updating its state with a new block. I looked into the archival-MMR and the archival-mutator set code and it didn't look optimal to me. I wrote the original implementation of all of the MMR code and the mutator set code, but multiple (beneficial) rewrites have occurred since then: all access to persistent storage has been made async, which benefits concurrency, and we got rid of two traits: one that tied MmrAccumulator and ArchivalMmr together, and one that tied MutatorSetAccumulator and ArchivalMutatorSet together. We got rid of the two traits because we didn't want the accumulator versions to have to be implemented through async functions. Although these rewrites are net beneficial, I have the sense that some performance was lost in it.

Notice that the accumulator versions store everything in RAM whereas the archival versions handle so much data, that they must live on disk, and they need to be persisted when the program shuts down.

With #126 behind us, I'm happy with the performance of ArchivalMmr.

A stylistic property I don't like about the current ArchivalMutatorSet is how it always goes through the logic from MutatorSetAccumulator. That looks like lazy programming to me. ArchivalMutatorSet has more knowledge than the accumulator-version, so it should be able to handle everything itself. Inspecting the code, it looks like the archival version uses the accumulator version as a library. Maybe something should be factored out so that's not necessary. Something to consider.

On the top of my list of priorities for this issue, though, is that the archival version is fast. So you could start by adding a relevant benchmark and then measure whether your changes make the code run faster. Since the archival version stores everything to disk, I would imagine that optimizing IO will yield the biggest result.

I hope this addresses some of your questions :)

Sword-Smith commented 6 months ago

Here's an example of a speedup I just found: prove for ArchivalMutatorSet goes through prove for the accumulator.

    pub async fn prove(
        &self,
        item: Digest,
        sender_randomness: Digest,
        receiver_preimage: Digest,
    ) -> MsMembershipProof {
        MutatorSetAccumulator::new(
            &self.aocl.get_peaks().await,
            self.aocl.count_leaves().await,
            &self.swbf_inactive.get_peaks().await,
            &self.swbf_active.clone(),
        )
        .prove(item, sender_randomness, receiver_preimage)
    }

And in the accumulator, prove is implemented as:

    pub fn prove(
        &self,
        item: Digest,
        sender_randomness: Digest,
        receiver_preimage: Digest,
    ) -> MsMembershipProof {
        // compute commitment
        let item_commitment = Hash::hash_pair(item, sender_randomness);

        // simulate adding to commitment list
        let auth_path_aocl = self.aocl.to_accumulator().append(item_commitment);
        let target_chunks: ChunkDictionary = ChunkDictionary::default();

        // return membership proof
        MsMembershipProof {
            sender_randomness: sender_randomness.to_owned(),
            receiver_preimage: receiver_preimage.to_owned(),
            auth_path_aocl,
            target_chunks,
        }
    }

So the only thing the ArchivalMutatorSet::prove needs from the accumulator is the MMR authentication path for the next leaf (that hasn't been added yet). Like I said above, this is lazy programming (or programming optimized for developer time, not for runtime cost). We only need an accumulator version of the AOCL MMR for this, not the whole mutator set accumulator. If a new function was added to ArchivalMmr we could even get this authentication path without needing an MmrAccumulator.

zkclay commented 6 months ago

@Sword-Smith ok, all this context is very helpful, I'll take another look. Thanks!

zkclay commented 5 months ago

Hey @Sword-Smith, I took another look at this, but unfortunately there's still a lot for me to wrap my head around here in this mutator set, and I've been making a lot slower progress than I'd like. I will also be pretty busy with some other commitments in the next week, so I want to give this issue up for someone else to work on, rather than hold the team and the roadmap back.

also, as note, I did test cutting out the MutatorSetAccumulator and moving the implementation into the MutatorSet itself, and I didn't see any speedups, so I think that could be left in depending on your preferences for the codebase.

Sword-Smith commented 5 months ago

Hey @Sword-Smith, I took another look at this, but unfortunately there's still a lot for me to wrap my head around here in this mutator set, and I've been making a lot slower progress than I'd like. I will also be pretty busy with some other commitments in the next week, so I want to give this issue up for someone else to work on, rather than hold the team and the roadmap back.

also, as note, I did test cutting out the MutatorSetAccumulator and moving the implementation into the MutatorSet itself, and I didn't see any speedups, so I think that could be left in depending on your preferences for the codebase.

No problem. Thanks for the commits you already contributed. Hope to see you back here at some point :)

Neptune-Crypto / neptune-core

Optimize archival Mutator Sets, with a focus on IO #127