Open cducrest opened 4 years ago
Detect which events are missing or added after reorgs (either in the relay or in py-eth-index) and react to that in the graph.
When applying an event, store the prior/post state together with the event. When a reorg occurs and events are missing, undo the missing events backwards and apply new events forwards. Nothing is stored in the graph, only alongside the events in the data structure used to check for new / missing events.
-> potentially more encapsulated than idea1
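A minimal sketch of the undo/redo idea, assuming a trustline state can be reduced to a single integer and the graph to a plain dict. `AppliedEvent`, `handle_reorg`, and the field names are hypothetical and not taken from the relay code base.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Trustline = Tuple[str, str]  # hypothetical key: (user_a, user_b)


@dataclass(frozen=True)
class AppliedEvent:
    block_number: int
    log_index: int
    trustline: Trustline
    prior_state: int  # state of the trustline before the event was applied
    post_state: int   # state of the trustline after the event was applied


def position(event: AppliedEvent) -> Tuple[int, int]:
    """Chronological position of an event in the chain."""
    return (event.block_number, event.log_index)


def handle_reorg(
    graph: Dict[Trustline, int],
    applied: List[AppliedEvent],
    missing: List[AppliedEvent],
    added: List[AppliedEvent],
) -> List[AppliedEvent]:
    """Undo missing events backwards, apply added events forwards.

    Returns the new list of applied events in chronological order.
    """
    # Undo newest first, so the oldest missing event restores the final state.
    for event in sorted(missing, key=position, reverse=True):
        graph[event.trustline] = event.prior_state
    # Apply oldest first, so the newest added event determines the final state.
    for event in sorted(added, key=position):
        graph[event.trustline] = event.post_state
    kept = [event for event in applied if event not in missing]
    return sorted(kept + added, key=position)
```

The graph itself stores nothing extra here; the prior/post states live only in the event records, as the idea describes.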
Stop using filters to get events for the graph, but get them from the indexer. Since events are always describing the final state of a trustline, it is fine to reapply events multiple times as long as the last event applied is the last event chronologically in the chain.
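To illustrate why reapplying is harmless, here is a small sketch under the assumption that each event carries the complete final state of its trustline. `TrustlineEvent` and `replay` are hypothetical names, and the graph is modelled as a plain dict.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class TrustlineEvent(NamedTuple):
    block_number: int
    log_index: int
    trustline: Tuple[str, str]
    final_state: int  # assumed: the event fully describes the resulting state


def replay(graph: Dict[Tuple[str, str], int], events: Iterable[TrustlineEvent]) -> None:
    """Apply events in chain order; each event overwrites the trustline state."""
    for event in sorted(events, key=lambda e: (e.block_number, e.log_index)):
        graph[event.trustline] = event.final_state
```

Because every event overwrites rather than increments, calling `replay` twice with the same events leaves the graph unchanged after the first call.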
Pull and store all latest events every 5 seconds.
We have a set latest_events and latest_applied_events.
We pull and store all latest events since block reorg_safe_block_nr + 1 in latest_events.
We check for missing events by checking for every event in latest_applied_events if it is in latest_events.
We check for added events by checking for events in latest_events but not in latest_applied_events.
We handle the added/missing events.
We update latest_applied_events by removing/adding events.
Set reorg_safe_block_nr to latest_block - reorg_safety.
Delete all events in latest_applied_events that we consider finalised since we will not pull them in the next iteration.
-> This mechanism could replace the current way we pull / receive events.
-> This could be adapted to check for added events more often than for missing events.
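The steps above could be sketched roughly as follows. `ReorgTracker`, `Event`, and the `fetch_events_since` callback are hypothetical; the callback stands in for whatever query returns all events from a given block onwards. The caller is expected to handle the returned missing/added events and to call `poll` periodically, for example every 5 seconds.

```python
from typing import Callable, Iterable, List, NamedTuple, Set, Tuple


class Event(NamedTuple):
    block_number: int
    log_index: int
    payload: str  # placeholder for the event data


class ReorgTracker:
    def __init__(self, reorg_safety: int) -> None:
        self.reorg_safety = reorg_safety
        self.reorg_safe_block_nr = -1  # nothing is considered final yet
        self.latest_applied_events: Set[Event] = set()

    def poll(
        self,
        fetch_events_since: Callable[[int], Iterable[Event]],
        latest_block: int,
    ) -> Tuple[List[Event], List[Event]]:
        """Run one iteration and return (missing, added) events."""
        latest_events = set(fetch_events_since(self.reorg_safe_block_nr + 1))
        missing = self.latest_applied_events - latest_events
        added = latest_events - self.latest_applied_events
        # Removing the missing events and adding the new ones yields latest_events.
        self.latest_applied_events = latest_events
        # Assumption: the safe block number never moves backwards.
        self.reorg_safe_block_nr = max(
            self.reorg_safe_block_nr, latest_block - self.reorg_safety
        )
        # Drop finalised events; they will not be pulled in the next iteration.
        self.latest_applied_events = {
            event
            for event in self.latest_applied_events
            if event.block_number > self.reorg_safe_block_nr
        }
        # Missing events newest first (to undo), added events oldest first (to apply).
        return sorted(missing, reverse=True), sorted(added)
```

Using sets makes the missing/added checks plain set differences, at the cost of requiring events to be hashable and to compare equal when they are the same log entry.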
Project to fix the graph sync
Current way it works
Events from filter
When we start the relay, it will get the list of addresses from the file addresses.json and for each address will call _start_listen_network(address): https://github.com/trustlines-protocol/relay/blob/3c7ac8e68aff8c65543e85975c4afa293a8a515d/src/relay/relay.py#L753
This will start a listener for each event type (trustline updates, transfers, balance updates, etc.).
The listeners are greenlets that get new entries on a filter every second: https://github.com/trustlines-protocol/relay/blob/3c7ac8e68aff8c65543e85975c4afa293a8a515d/src/relay/blockchain/proxy.py#L142
https://github.com/trustlines-protocol/relay/blob/3c7ac8e68aff8c65543e85975c4afa293a8a515d/src/relay/blockchain/proxy.py#L26-37
The filter is a regular web3 filter that gets notified by the blockchain node (parity) when an event for the selected address and type occurs.
When events are seen, they trigger changes in the graph and send push notifications to the user: https://github.com/trustlines-protocol/relay/blob/3c7ac8e68aff8c65543e85975c4afa293a8a515d/src/relay/relay.py#L807-L820
State from querying node
The problem is that filters do not handle forks: the node does not notify a filter in any way when an event disappears, for example due to a reorg.
The way we handle that is by starting a sync process at the same time we start listening on events: https://github.com/trustlines-protocol/relay/blob/3c7ac8e68aff8c65543e85975c4afa293a8a515d/src/relay/relay.py#L757
This function will start a periodic process (by default every 5 min) that will regenerate the graph by directly querying the node for the state of the blockchain. This does not use events. https://github.com/trustlines-protocol/relay/blob/3c7ac8e68aff8c65543e85975c4afa293a8a515d/src/relay/blockchain/currency_network_proxy.py#L68
This should allow us to be "eventually" correct on the graph.
Problems
1) It can occur that while we are syncing the graph by querying the node, events come in and update the graph via filters. The graph regenerated from the state then overwrites the previous graph, thus erasing the update made by the event.
2) When getting events from the filter, there is no guarantee as far as I know that events are ordered chronologically blockchain-wise (blocknumber, logindex). Since we collect events every second, it could also occur that we get the later event (blockchain-wise) in the earlier second (relay-time-wise) and the earlier event (blockchain-wise) in the later second (relay-time-wise), producing a wrong result.
3) We have two sources of truth in the relay: the events from the node, and the ethindex. These might disagree with each other and produce ambiguous behaviours.
4) Regenerating the graph every 5 min is probably not viable if the graph gets too big.
Potentially Easy Solutions
For problem 1), instead of recreating the whole graph and applying it all at once, we could apply it trustline by trustline, considerably reducing the odds that an event modifies a trustline while it is being updated. However, during the update process, the graph is a mix of different sources of information and might produce odd results, for example when someone asks for a path.
For problem 2), we can order the events we get from the filter. That does not solve the problem that events might arrive out of order across two successive queries of the filter.
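Ordering a single batch of filter entries could look like the sketch below. It assumes each entry exposes the `blockNumber` and `logIndex` keys that web3 log entries carry; `order_log_entries` is a hypothetical helper name. As noted, this only orders events within one batch.

```python
from typing import Any, Iterable, List, Mapping


def order_log_entries(entries: Iterable[Mapping[str, Any]]) -> List[Mapping[str, Any]]:
    """Sort log entries chronologically blockchain-wise: block number, then log index."""
    return sorted(entries, key=lambda entry: (entry["blockNumber"], entry["logIndex"]))
```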