raiden-network / raiden

Raiden Network
https://developer.raiden.network
Other
1.84k stars 378 forks source link

All Nodes on SP get a MatrixRequestError due to which none of the scenarios work #5259

Closed agatsoh closed 4 years ago

agatsoh commented 5 years ago

Problem Definition

A common error in all the node logs run-xxx.stderr

--------- Starting ---------
Traceback (most recent call last):
  File "/usr/local/bin/raiden", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/raiden/__main__.py", line 13, in main
    run(auto_envvar_prefix="RAIDEN")  # pylint: disable=no-value-for-parameter
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1114, in invoke
    return Command.invoke(self, ctx)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/raiden/ui/cli.py", line 594, in run
    app = runner.run()
  File "/usr/local/lib/python3.7/site-packages/raiden/ui/runners.py", line 245, in run
    return self._start_services()
  File "/usr/local/lib/python3.7/site-packages/raiden/ui/runners.py", line 92, in _start_services
    app_ = run_app(**self._options)
  File "/usr/local/lib/python3.7/site-packages/raiden/ui/app.py", line 371, in run_app
    raiden_app.start()
  File "/usr/local/lib/python3.7/site-packages/raiden/app.py", line 137, in start
    self.raiden.start()
  File "/usr/local/lib/python3.7/site-packages/raiden/raiden_service.py", line 436, in start
    self._start_transport(chain_state)
  File "/usr/local/lib/python3.7/site-packages/raiden/raiden_service.py", line 524, in _start_transport
    prev_auth_data=chain_state.last_transport_authdata,
  File "/usr/local/lib/python3.7/site-packages/raiden/network/transport/matrix/transport.py", line 380, in start
    prev_access_token=prev_access_token,
  File "/usr/local/lib/python3.7/site-packages/raiden/network/transport/matrix/utils.py", line 512, in login_or_register
    client.login(username, password, sync=False)
  File "/usr/local/lib/python3.7/site-packages/matrix_client/client.py", line 270, in login
    "m.login.password", user=username, password=password, device_id=device_id
  File "/usr/local/lib/python3.7/site-packages/matrix_client/api.py", line 160, in login
    return self._send("POST", "/login", content)
  File "/usr/local/lib/python3.7/site-packages/raiden/network/transport/matrix/client.py", line 151, in _send
    return super()._send(method, path, *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/matrix_client/api.py", line 691, in _send
    code=response.status_code, content=response.text
matrix_client.errors.MatrixRequestError: 500: {"errcode":"M_UNKNOWN","error":"Internal server error"}
Exception ignored in: <function SQLiteStorage.__del__ at 0x7f2fcb1de730>
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/raiden/storage/sqlite.py", line 731, in __del__
RuntimeError: The database connection was not closed.
--------- Stopped ---------

And the node-xxx.log

{"event": "Matrix starting", "logger": "raiden.network.transport.matrix.transport", "level": "debug", "timestamp": "2019-11-08 02:05:59.814604"}
{"wait_for": 1.0, "_exception": "MatrixRequestError('500: {\"errcode\":\"M_UNKNOWN\",\"error\":\"Internal server error\"}')", "event": "Got http _send exception, waiting then retrying", "logger": "raiden.network.transport.matrix.client", "level": "debug", "timestamp": "2019-11-08 02:05:59.849142"}
{"wait_for": 1.0, "_exception": "MatrixRequestError('500: {\"errcode\":\"M_UNKNOWN\",\"error\":\"Internal server error\"}')", "event": "Got http _send exception, waiting then retrying", "logger": "raiden.network.transport.matrix.client", "level": "debug", "timestamp": "2019-11-08 02:06:00.878090"}
{"wait_for": 1.0, "_exception": "MatrixRequestError('500: {\"errcode\":\"M_UNKNOWN\",\"error\":\"Internal server error\"}')", "event": "Got http _send exception, waiting then retrying", "logger": "raiden.network.transport.matrix.client", "level": "debug", "timestamp": "2019-11-08 02:06:01.901924"}
{"wait_for": 1.0, "_exception": "MatrixRequestError('500: {\"errcode\":\"M_UNKNOWN\",\"error\":\"Internal server error\"}')", "event": "Got http _send exception, waiting then retrying", "logger": "raiden.network.transport.matrix.client", "level": "debug", "timestamp": "2019-11-08 02:06:02.928061"}
{"wait_for": 1.0, "_exception": "MatrixRequestError('500: {\"errcode\":\"M_UNKNOWN\",\"error\":\"Internal server error\"}')", "event": "Got http _send exception, waiting then retrying", "logger": "raiden.network.transport.matrix.client", "level": "debug", "timestamp": "2019-11-08 02:06:05.870493"}
{"wait_for": 2.0, "_exception": "MatrixRequestError('500: {\"errcode\":\"M_UNKNOWN\",\"error\":\"Internal server error\"}')", "event": "Got http _send exception, waiting then retrying", "logger": "raiden.network.transport.matrix.client", "level": "debug", "timestamp": "2019-11-08 02:06:07.869860"}
{"wait_for": 4.0, "_exception": "MatrixRequestError('500: {\"errcode\":\"M_UNKNOWN\",\"error\":\"Internal server error\"}')", "event": "Got http _send exception, waiting then retrying", "logger": "raiden.network.transport.matrix.client", "level": "debug", "timestamp": "2019-11-08 02:06:10.864412"}
{"wait_for": 5.0, "_exception": "MatrixRequestError('500: {\"errcode\":\"M_UNKNOWN\",\"error\":\"Internal server error\"}')", "event": "Got http _send exception, waiting then retrying", "logger": "raiden.network.transport.matrix.client", "level": "debug", "timestamp": "2019-11-08 02:06:15.863616"}
{"wait_for": 5.0, "_exception": "MatrixRequestError('500: {\"errcode\":\"M_UNKNOWN\",\"error\":\"Internal server error\"}')", "event": "Got http _send exception, waiting then retrying", "logger": "raiden.network.transport.matrix.client", "level": "debug", "timestamp": "2019-11-08 02:06:20.898324"}
{"wait_for": 5.0, "_exception": "MatrixRequestError('500: {\"errcode\":\"M_UNKNOWN\",\"error\":\"Internal server error\"}')", "event": "Got http _send exception, waiting then retrying", "logger": "raiden.network.transport.matrix.client", "level": "debug", "timestamp": "2019-11-08 02:06:25.927322"}
{"wait_for": 5.0, "_exception": "MatrixRequestError('500: {\"errcode\":\"M_UNKNOWN\",\"error\":\"Internal server error\"}')", "event": "Got http _send exception, waiting then retrying", "logger": "raiden.network.transport.matrix.client", "level": "debug", "timestamp": "2019-11-08 02:06:30.962354"}
{"wait_for": 5.0, "_exception": "MatrixRequestError('500: {\"errcode\":\"M_UNKNOWN\",\"error\":\"Internal server error\"}')", "event": "Got http _send exception, waiting then retrying", "logger": "raiden.network.transport.matrix.client", "level": "debug", "timestamp": "2019-11-08 02:06:35.990708"}

Logs of a few scenarios

Scenario_player_Logs-mfee1_flat_fee-1-latest-2019-11-08.tar.gz Scenario_player_Logs-pfs1_get_a_simple_path-1-latest-2019-11-08.tar.gz Scenario_player_Logs-pfs2_simple_no_path-1-latest-2019-11-08.tar.gz Scenario_player_Logs-bf1_basic_functionality-1-latest-2019-11-08.tar.gz

jomuel commented 4 years ago

Closing this as it is obviously caused by a known and fixed issue.

All the given scenario runs are from the 8th of November, when there was trouble with the transport due to the now reverted/fixed #5225. A look into the logs confirms that every single request to the matrix servers resulted in a MatrixRequestError. All scenarios except for bf1 failed during node startup sync already, bf1 immediately afterwards.