Closed clangenb closed 3 years ago
@brenzi, @pifragile FYI.
I have to raise this question again: Our CI shows problems with the bootstrap_demo_community.py script
While registering attestations, claims get rejected because the location is wrong:
2021-10-01 20:56:54.003 DEBUG tokio-runtime-worker encointer: meetup 1 at location Location { lat: 35.23215941201715395437, lon: 18.4075927734375 } should happen at 1633171680000 for cid 0x672dbc5b5940bbf438c2a6217afd23f334b6051945a437f46b33333dbfb05ce2
2021-10-01 20:56:54.003 WARN tokio-runtime-worker encointer: ignoring claim beyond location tolerance: Location { lat: 35.4841563798531680618, lon: 18.543548583984375 }
2021-10-01 20:56:54.003 WARN tokio-runtime-worker encointer: ignoring claim beyond location tolerance: Location { lat: 35.4841563798531680618, lon: 18.543548583984375 }
2021-10-01 20:56:54.005 DEBUG tokio-runtime-worker encointer: meetup 1 at location Location { lat: 35.23215941201715395437, lon: 18.4075927734375 } should happen at 1633171680000 for cid 0x672dbc5b5940bbf438c2a6217afd23f334b6051945a437f46b33333dbfb05ce2
2021-10-01 20:56:54.005 WARN tokio-runtime-worker encointer: ignoring claim beyond location tolerance: Location { lat: 35.4841563798531680618, lon: 18.543548583984375 }
2021-10-01 20:56:54.005 WARN tokio-runtime-worker encointer: ignoring claim beyond location tolerance: Location { lat: 35.4841563798531680618, lon: 18.543548583984375 }
2021-10-01 20:56:54.007 DEBUG tokio-runtime-worker encointer: meetup 1 at location Location { lat: 35.23215941201715395437, lon: 18.4075927734375 } should happen at 1633171680000 for cid 0x672dbc5b5940bbf438c2a6217afd23f334b6051945a437f46b33333dbfb05ce2
the client uses the rpc api to query communities_getLocations
which hits the get_locations
function in the communities pallet:
https://github.com/encointer/pallets/blob/5c6c258d9d3c091963b92ea9553570ce279cda16/communities/rpc/src/lib.rs#L111
while the register_attestation uses the same function
still, the pallet seems to (sometimes) face a different order of locations. In the example log above, the client picked the first location in the community json spec while the dispatchable chose the penultimate
reproduce with https://github.com/encointer/encointer-node/tree/client-bot-fixes
I never experienced this problem with bot-communities.py where locations are generated in random order (but always strictly rectangular)
Mediterranea locations are rather circular: https://github.com/encointer/encointer-node/blob/client-bot-fixes/test-data/test-locations-mediterranean.json
Observation: looking at the geohashes of the two locations in question:
sqm06
and sqm1v
we can see that they are not in neighboring buckets. Maybe mediterranea just has locations very far apart?
neigbors of sqm06 are:
Neighbours:
sqm09 sqm0d sqm0e
sqm03 sqm06 sqm07
sqm01 sqm04 sqm05
I quickly hacked a sorting of locations by lat, then lon. The problem persists, so the cause must be sth else
the problem was the cache dirty key has only been set by new-community, not by add_location. So rpc returned a different set of locations than what is actually in the state
fixed in a2861093b345d88c96e78a92074ac38973ab142f
I was worried about non-determinism because I thought that I read that in substrate, and I also knew that hashmaps do, in general, not always return the same order.
However, looking at the tests from
StorageDoubleMap
it seems that I was wrong:https://github.com/paritytech/substrate/blob/50c84cb8f92b6446b4a7b3043684b116aaea6866/frame/support/src/storage/generator/double_map.rs#L525-559
Furthermore, I realize now that questioning determinism of something that is used in a blockchain runtime is questionable itself. :P