hyperledger / indy-node

The server portion of a distributed ledger purpose-built for decentralized identity.
https://wiki.hyperledger.org/display/indy
Apache License 2.0
685 stars 657 forks source link

Revocation Registry Entry update depends on the ordering of sets #1769

Closed c2bo closed 2 years ago

c2bo commented 2 years ago

This should explain the behavior from #1750.

When a revocation registry entry is received, the corresponding handler will find out the revocation registry strategy of the corresponding revocation registry and use the one of the 2 implementations provided in revocation_strategy.py:

The code below uses a set to aggregate the existing revocations with the newly added ones and converts back to a list after. This results in undefined behavior regarding ordering: A set is unordered in python and by converting a set to a list without sorting, the list will have the same order which is not guaranteed.

https://github.com/hyperledger/indy-node/blob/1e24fc2fb5eeb8778cf64310e96da23a2c279985/indy_node/server/revocation_strategy.py#L140-L144

I created a small test program mimicking the current implementation:

def add(a, b):
    seta = set(a)
    seta.update(b)
    new_val = list(seta)
    return new_val

current = [1]
print(current)
current = add(current, [5,7,6])
print(current)
current = add(current, [8])
print(current)

This creates different behavior depending on python implementation:

python:3.5-buster:

Python 3.5.10 (default, Sep 10 2020, 18:30:47)
[GCC 8.3.0] on linux
[1]
[1, 5, 6, 7]
[8, 1, 5, 6, 7]

python:3.8-buster:

Python 3.8.13 (default, Aug  2 2022, 11:55:19)
[GCC 8.3.0] on linux
[1]
[1, 5, 6, 7]
[1, 5, 6, 7, 8]

We can see that the ordering behaves differently (5,7,6 gets sorted - 8 does not)

In indy, this results in the state trie diverging and the catchup of a node using newer versions of python to fail.

c2bo commented 2 years ago

Behavior of this seems to have changed with python3.6

WadeBarnes commented 2 years ago

Great find

c2bo commented 2 years ago

Just to be sure this doesn't get forgotten, this problem is present in both revocation strategies:

https://github.com/hyperledger/indy-node/blob/1e24fc2fb5eeb8778cf64310e96da23a2c279985/indy_node/server/revocation_strategy.py#L140-L144

https://github.com/hyperledger/indy-node/blob/1e24fc2fb5eeb8778cf64310e96da23a2c279985/indy_node/server/revocation_strategy.py#L198-L202

swcurran commented 2 years ago

Suggestion on how to avoid a config for the issue. Can the code run the signature verifier and if it fails on a rev-reg-entry, regenerate the list to force it to "the old way", reverify the signature and accept that? That way we can move forward with the upgrade to Python and not have to have a startup parameter that says when this change went into efffect.

WadeBarnes commented 2 years ago

Or is something that can be fixed and we could use a migration script for; https://github.com/hyperledger/indy-node/tree/main/data/migrations

swcurran commented 2 years ago

Wild -- do those scripts allow for updating transactions/signatures/roots on existing transactions? They would be able to fix this issue?

c2bo commented 2 years ago

Wouldn't we still have an issue with having incorrect state root hashes in the audit ledger breaking the state proofs?

WadeBarnes commented 2 years ago

Wild -- do those scripts allow for updating transactions/signatures/roots on existing transactions? They would be able to fix this issue?

The migration scripts have been used for a variety of reasons in the past; data migrations, config updates, etc.

c2bo commented 2 years ago

Suggestion on how to avoid a config for the issue. Can the code run the signature verifier and if it fails on a rev-reg-entry, regenerate the list to force it to "the old way", reverify the signature and accept that? That way we can move forward with the upgrade to Python and not have to have a startup parameter that says when this change went into efffect.

That is another issue I realized while this happened during catchup: This happened at domain tx 843 but the node did not indicate any kind of error and went through all the domain transactions. I do believe that state root hashes are currently not verified during catchup. That is something I planned to understand this week and create a separate issue if this is correct.

c2bo commented 2 years ago

Fixed by #1777 #1783 #1775 #1770, can we close this or should we wait for the final testing and node 1.13 release?

WadeBarnes commented 2 years ago

I think it's safe to close.