integritee-network / worker

Integritee off-chain worker and sidechain validateer
Apache License 2.0
89 stars 46 forks source link

Bug: Non-determinisic state-hash #421

Closed clangenb closed 2 years ago

clangenb commented 2 years ago

It turns out our approach of comparing state-hashes for sidechain-block import is flawed because the state hash over hash maps is not deterministic due to arbitrary key order, when the state is loaded from disk.

Note: I put keyword #421 at relevant places in code.

brenzi commented 2 years ago

easy fix would be ordering of hashmap

brenzi commented 2 years ago

check how substrate does deteministic hash map

murerfel commented 2 years ago

EDIT: Nevermind my comment below, it turns out it was a mistake in my reproduction of the code. The sidechain crate properly calculates the hash only over the state member of the StfState.

~ The issue IMO does not stem from the underlying hash map. I can write a test that passes using in-memory state with a hash map. The problem arises when we encode, encrypt and write the state to a file and load it again. And the issue, as far as I can tell, is that we only write a portion of what is contained in an StfState to the file (we leave out the state diff). For recapping, this is the StfState: ~

pub struct SgxExternalities {
    pub state: SgxExternalitiesType,
    pub state_diff: SgxExternalitiesDiffType,
}

~ So of course, when we load the StfState again from file, the state_diff will be 'empty' and when we then calculate the hash over the entire StfState, it will be different from what we had before. ~ ~ The question then arises: Is it still a bug and a real problem? Or should the hash of StfState only take into account its state member and not the state_diff? @clangenb
~

murerfel commented 2 years ago

After some more investigation, the problem seems to come from the serialization and de-serialization of the HashMap that is underlying the SgxExternalitiesType. We do our own (possibly home-grown?) serialization, see here. I'll have to dive deeper into that implementation to find out more

clangenb commented 2 years ago

Yes, sorry. I could have told you this much. 😓. Anyhow, good that you came to the same conclusion.

Yes, I reckon that it's most likely due to nondeterministic serialization inside the sgx-libs. I have run some checks to implement encode/decode for hashmaps efficiently in generic no_std environments, but I could unfortunately not find a suitable and simple to implement library back then.

murerfel commented 2 years ago

One solution that I found, is to use BTreeMap instead of HashMap. The binary tree is deterministic, because ordered. In addition, I was able to use serde and serde_json to encode/decode the map, making the sgx-externalities fully compatible with std AND sgx features (and also a few tests that can be run with cargo test). The question now remains: Is it acceptable to have our state in a binary tree BTreeMap instead of a HashMap?

clangenb commented 2 years ago

I have taken a look at the pr there, and I think it looks very nice, and is a good solution, except for the BTreeMap, which I am unsure about. A can't really estimate the absolute performance consequences of the BTreeMap, but is much less performant for sure.

EDIT: The more I think about it, I doubt that this is performant enough.

murerfel commented 2 years ago

What exactly makes you think it's not performant enough? I can't say how it stacks up to manually ordering the hash map every time we compute the hash of the entire state. In general I think it's hard to estimate anything performance related without really benchmarking it.

EDIT: According to some posts (like this one), BTreeMap even outperforms HashMap in certain scenarios. So I think it's not a very clear matter.

EDIT 2: There are also other hash map implementation which preserve order, such as IndexMap. This could be an alternative in case we find out that BTreeMap isn't a valid option

brenzi commented 2 years ago

I suggest to implement something that works for now, so @mullefel please go ahead with what you think is the best approach - which I understand would be BTreeMap, right?. We should just keep an open issue to investigate on performance improvements

clangenb commented 2 years ago

Well, your comparison is only about serialization and deserialization. I am concerned, however, about lookup performance, where hash-map should have constant complexity, whereas tree-maps have logarithmic complexity. Benchmark of map implementations:

https://github.com/Rufflewind/bench-maps.

But you are right, this is trade-off between 'hashing' performance and 'lookup/insertion' performance. So we can't say yet, which option is better.

murerfel commented 2 years ago

I created https://github.com/integritee-network/worker/issues/908 to further investigate.