It appears that the greenlet responsible for fetching updates from ethindex and applying them to the graph dies.
It dies because we try to send updates for old events here: https://github.com/trustlines-protocol/relay/blob/master/src/relay/relay.py#L763
When we publish the events for a trustline, we process them from oldest to newest, so a trustline may already have been updated on the graph with timestamp 100 when we try to publish an event with timestamp 30.
Upon publishing the event, we calculate the balance of the trustline with interests: https://github.com/trustlines-protocol/relay/blob/2a807deb2941e79bef5c436bc32e0fd3679a0397/src/relay/relay.py#L899
This calculation crashes when the time difference is (too) negative: https://github.com/trustlines-protocol/relay/blob/master/src/relay/network_graph/interests.py#L14
The easy solution is to publish only once per trustline and once per user, and to publish only the most recent event in each case.
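As a rough illustration of that easy solution, here is a minimal sketch that reduces a batch of events to the most recent one per key before publishing. The `Event` class, its `key` and `timestamp` fields, and the `latest_per_key` helper are all hypothetical names made up for this example; they are not taken from the relay code.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, Iterable, List


@dataclass(frozen=True)
class Event:
    # Hypothetical stand-in for an ethindex event; field names are assumptions.
    key: Hashable  # e.g. a (user, counterparty) pair for a trustline, or a user address
    timestamp: int


def latest_per_key(events: Iterable[Event]) -> List[Event]:
    """Return one event per key, keeping the one with the highest timestamp."""
    latest: Dict[Hashable, Event] = {}
    for event in events:
        current = latest.get(event.key)
        # On equal timestamps, the event seen later in the input wins.
        if current is None or event.timestamp >= current.timestamp:
            latest[event.key] = event
    # Publish in chronological order.
    return sorted(latest.values(), key=lambda e: e.timestamp)
```

With this, a batch containing events at timestamps 30 and 100 for the same trustline would only lead to publishing the one at 100, so the published event is never older than the state already on the graph for that batch.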
A better solution is to decouple all this stuff, but I am not sure we want to take the time for this now. I guess the graph module should be responsible for sending updates instead of the relay module. Right now the relay coordinates fetching the updates, updating the graph, and publishing the updates (which requires calling the graph once again to get its status). We should probably make it so that the graph sends out the event at the same time it updates itself.
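To make the decoupling idea a bit more concrete, here is a minimal sketch in which the graph notifies subscribers as part of each update, so the published state always matches what was just applied. `ObservableGraph`, `subscribe` and `update_trustline` are hypothetical names for illustration only and do not reflect the actual graph module's API.

```python
from typing import Callable, Dict, List, Tuple

TrustlineKey = Tuple[str, str]
# A listener receives (trustline key, balance, timestamp).
Listener = Callable[[TrustlineKey, int, int], None]


class ObservableGraph:
    """Hypothetical graph that sends out an event whenever it updates itself."""

    def __init__(self) -> None:
        # Maps a trustline key to its (balance, timestamp).
        self._state: Dict[TrustlineKey, Tuple[int, int]] = {}
        self._listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def update_trustline(self, key: TrustlineKey, balance: int, timestamp: int) -> None:
        # Apply the update, then notify with exactly the state just stored,
        # so no second lookup on the graph is needed to publish.
        self._state[key] = (balance, timestamp)
        for listener in self._listeners:
            listener(key, balance, timestamp)
```

In such a design the relay would only subscribe its publishing function once and feed events to the graph; it would no longer have to query the graph again after each update.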
The problem was hard to find because when the greenlet dies, its exception does not "bubble up". We should make the relay crash when this greenlet dies, with code like:
greenlet.link_exception(lambda *args: sys.exit("important greenlet died"))