encointer / pallets

all application-specific pallets for encointer
GNU General Public License v3.0
19 stars 3 forks source link

Investigate that storage map iterator returns deterministic order #60

Closed clangenb closed 3 years ago

clangenb commented 3 years ago

I was worried about non-determinism because I thought that I read that in substrate, and I also knew that hashmaps do, in general, not always return the same order.

However, looking at the tests from StorageDoubleMap it seems that I was wrong:

https://github.com/paritytech/substrate/blob/50c84cb8f92b6446b4a7b3043684b116aaea6866/frame/support/src/storage/generator/double_map.rs#L525-559

Furthermore, I realize now that questioning determinism of something that is used in a blockchain runtime is questionable itself. :P

clangenb commented 3 years ago

@brenzi, @pifragile FYI.

brenzi commented 3 years ago

I have to raise this question again: Our CI shows problems with the bootstrap_demo_community.py script

While registering attestations, claims get rejected because the location is wrong:

2021-10-01 20:56:54.003 DEBUG tokio-runtime-worker encointer: meetup 1 at location Location { lat: 35.23215941201715395437, lon: 18.4075927734375 } should happen at 1633171680000 for cid 0x672dbc5b5940bbf438c2a6217afd23f334b6051945a437f46b33333dbfb05ce2    
2021-10-01 20:56:54.003  WARN tokio-runtime-worker encointer: ignoring claim beyond location tolerance: Location { lat: 35.4841563798531680618, lon: 18.543548583984375 }    
2021-10-01 20:56:54.003  WARN tokio-runtime-worker encointer: ignoring claim beyond location tolerance: Location { lat: 35.4841563798531680618, lon: 18.543548583984375 }    
2021-10-01 20:56:54.005 DEBUG tokio-runtime-worker encointer: meetup 1 at location Location { lat: 35.23215941201715395437, lon: 18.4075927734375 } should happen at 1633171680000 for cid 0x672dbc5b5940bbf438c2a6217afd23f334b6051945a437f46b33333dbfb05ce2    
2021-10-01 20:56:54.005  WARN tokio-runtime-worker encointer: ignoring claim beyond location tolerance: Location { lat: 35.4841563798531680618, lon: 18.543548583984375 }    
2021-10-01 20:56:54.005  WARN tokio-runtime-worker encointer: ignoring claim beyond location tolerance: Location { lat: 35.4841563798531680618, lon: 18.543548583984375 }    
2021-10-01 20:56:54.007 DEBUG tokio-runtime-worker encointer: meetup 1 at location Location { lat: 35.23215941201715395437, lon: 18.4075927734375 } should happen at 1633171680000 for cid 0x672dbc5b5940bbf438c2a6217afd23f334b6051945a437f46b33333dbfb05ce2 

the client uses the rpc api to query communities_getLocations which hits the get_locations function in the communities pallet: https://github.com/encointer/pallets/blob/5c6c258d9d3c091963b92ea9553570ce279cda16/communities/rpc/src/lib.rs#L111

while the register_attestation uses the same function

https://github.com/encointer/pallets/blob/5c6c258d9d3c091963b92ea9553570ce279cda16/ceremonies/src/lib.rs#L611

still, the pallet seems to (sometimes) face a different order of locations. In the example log above, the client picked the first location in the community json spec while the dispatchable chose the penultimate

reproduce with https://github.com/encointer/encointer-node/tree/client-bot-fixes

I never experienced this problem with bot-communities.py where locations are generated in random order (but always strictly rectangular)

Mediterranea locations are rather circular: https://github.com/encointer/encointer-node/blob/client-bot-fixes/test-data/test-locations-mediterranean.json

brenzi commented 3 years ago

Observation: looking at the geohashes of the two locations in question: sqm06 and sqm1v we can see that they are not in neighboring buckets. Maybe mediterranea just has locations very far apart?

neigbors of sqm06 are:

Neighbours:
sqm09   sqm0d   sqm0e
sqm03   sqm06   sqm07
sqm01   sqm04   sqm05
brenzi commented 3 years ago

I quickly hacked a sorting of locations by lat, then lon. The problem persists, so the cause must be sth else

brenzi commented 3 years ago

the problem was the cache dirty key has only been set by new-community, not by add_location. So rpc returned a different set of locations than what is actually in the state

fixed in a2861093b345d88c96e78a92074ac38973ab142f