cducrest closed this issue 3 years ago
The problem was traced to the relay using a websocket connection for the node RPC. Web3 uses asyncio to work with websockets, and asyncio does not cooperate well with gevent. I spent time reading about it and experimenting to find a solution, but it appears to be impossible. I believe we should drop support for websockets and revert to HTTP.
I still do not know why this was not a problem in the past and only became one now, but as far as I can tell, the fix should not be to try to make asyncio and gevent work together.
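If we do drop websocket support, one option would be to reject websocket URLs early with a clear error instead of crashing later inside gevent. Below is a minimal sketch of that idea using only the standard library. The function name check_node_rpc_url and the example URLs and ports are hypothetical and not taken from the relay code base; where such a check would be called in the relay is left open.

```python
from urllib.parse import urlparse

# URL schemes that would make web3 use its asyncio-based websocket provider.
WEBSOCKET_SCHEMES = {"ws", "wss"}
# URL schemes we would keep supporting for the node RPC.
HTTP_SCHEMES = {"http", "https"}


def check_node_rpc_url(url: str) -> str:
    """Return the URL unchanged if it is an HTTP(S) URL, else raise ValueError.

    Hypothetical helper: it only inspects the URL scheme and never opens
    a connection to the node.
    """
    scheme = urlparse(url).scheme.lower()
    if scheme in WEBSOCKET_SCHEMES:
        raise ValueError(
            f"Websocket node RPC URLs are not supported (got {url!r}); "
            "use an http:// or https:// URL instead."
        )
    if scheme not in HTTP_SCHEMES:
        raise ValueError(f"Unsupported node RPC URL scheme in {url!r}")
    return url
```

With this in place, a configuration such as ws://localhost:8546 would fail at startup with a readable message, while http://localhost:8545 would pass through unchanged.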
When Sascha started version 0.20.2 of the relay, it crashed with the following traceback:
An important change between 0.20.1 and 0.20.2 is that I upgraded the relay to work with Python 3.8, which meant updating gevent from 1.4.0 to 21.1.2. I can see related issues with the same problem: https://github.com/gevent/gevent/issues/1698 and https://github.com/kimbauters/ZIMply/issues/6. I am not sure, though, why we call monkey.patch_all(thread=False), that is, with thread=False, in the boot module: https://github.com/trustlines-protocol/relay/blob/master/src/relay/boot.py#L8
We cannot reproduce the problem locally or on the staging relay server. Ralf suggested that it could be because Sascha runs Sentry and we do not. It appears we do run Sentry on the devel server, so I could try the update there. Ralf also mentioned that web3 uses asyncio, which could be a problem. Another difference is that Sascha uses a websocket URL while we use HTTP.