waku-org / nwaku

Waku node and protocol.
Other
203 stars 54 forks source link

bug: Relay publish returns Failed to publish: timedout when a peer filter node is disconnected #2319

Closed fbarbu15 closed 5 months ago

fbarbu15 commented 11 months ago

To reproduce

  1. Have a relay node
  2. Connect 1 or more filter nodes to the relay node
  3. Subscribe to some topics and verify that messages published via Relay reach the Filter nodes
  4. Pause or disconnect one of the filter containers
  5. Publish a new message from the relay node

Actual behavior

500 Server Error with message Failed to publish: timedout

Expected behavior

Publish should work

nwaku version/commit hash

harbor.status.im/wakuorg/nwaku:latest

Docker logs

node1_2023-12-27_16-48-5911ce13f6-d370-4692-995e-e4ea50cf4218harbor.status.im_wakuorg_nwakulatest.log node2_2023-12-27_16-48-5911ce13f6-d370-4692-995e-e4ea50cf4218harbor.status.im_wakuorg_nwakulatest.log

Test logs

16:48:59 src.node.docker_mananger DEBUG Starting container with image harbor.status.im/wakuorg/nwaku:latest
16:48:59 src.node.docker_mananger DEBUG Using args ['--listen-address=0.0.0.0', '--rpc=true', '--rpc-admin=true', '--rest=true', '--rest-admin=true', '--websocket-support=true', '--log-level=TRACE', '--rest-relay-cache-capacity=100', '--websocket-port=34018', '--rpc-port=34016', '--rest-port=34015', '--tcp-port=34017', '--discv5-udp-port=34019', '--rpc-address=0.0.0.0', '--rest-address=0.0.0.0', '--nat=extip:172.18.114.188', '--peer-exchange=true', '--discv5-discovery=true', '--cluster-id=0', '--metrics-server=true', '--metrics-server-address=0.0.0.0', '--metrics-server-port=34020', '--metrics-logging=true', '--relay=true', '--filter=true', '--nodekey=30348dd51465150e04a5d9d932c72864c8967f806cce60b5d26afeca1e77eb68']
16:49:00 src.node.docker_mananger DEBUG Container started with ID d0528d7acb4f. Setting up logs at ./log/docker/node1_2023-12-27_16-48-59__11ce13f6-d370-4692-995e-e4ea50cf4218__harbor.status.im_wakuorg_nwaku:latest.log
16:49:00 src.node.waku_node DEBUG Started container from image harbor.status.im/wakuorg/nwaku:latest. RPC: 34016 REST: 34015 WebSocket: 34018 TCP: 34017
16:49:00 src.libs.common DEBUG Sleeping for 1 seconds
16:49:01 src.node.api_clients.base_client INFO GET call: http://127.0.0.1:34015/debug/v1/info with payload: None
16:49:01 src.node.api_clients.base_client INFO Response status code: 200. Response content: b'{"listenAddresses":["/ip4/172.18.114.188/tcp/34017/p2p/16Uiu2HAmGNtM2rQ8abySFNhqPDFY4cmfAEpfo9Z9fD3NekoFR2ip","/ip4/172.18.114.188/tcp/34018/ws/p2p/16Uiu2HAmGNtM2rQ8abySFNhqPDFY4cmfAEpfo9Z9fD3NekoFR2ip"],"enrUri":"enr:-Kq4QKq0U3Cy4-30vayvPFEcu0s6aU-aCoKf3RvktZKnjGkUH9Kl-W07aF4w6ihJuNdLZ7mcFntDH0AQvvL31_doatwBgmlkgnY0gmlwhKwScryKbXVsdGlhZGRyc4wACgSsEnK8BoTi3QOJc2VjcDI1NmsxoQM3Tqpf5eFn4Jztm4gB0Y0JVSJyxyZsW8QR-QU5DZb-PYN0Y3CChOGDdWRwgoTjhXdha3UyBQ"}'
16:49:01 src.node.waku_node INFO REST service is ready !!
16:49:01 src.steps.filter DEBUG Running fixture setup: setup_main_filter_node
16:49:01 src.node.docker_mananger DEBUG Docker client initialized with image harbor.status.im/wakuorg/nwaku:latest
16:49:01 src.node.waku_node DEBUG WakuNode instance initialized with log path ./log/docker/node2_2023-12-27_16-48-59__11ce13f6-d370-4692-995e-e4ea50cf4218__harbor.status.im_wakuorg_nwaku:latest.log
16:49:01 src.node.waku_node DEBUG Starting Node...
16:49:01 src.node.docker_mananger DEBUG Attempting to create or retrieve network waku
16:49:01 src.node.docker_mananger DEBUG Network waku already exists
16:49:01 src.node.docker_mananger DEBUG Generated random external IP 172.18.121.213
16:49:01 src.node.docker_mananger DEBUG Generated ports [4971, 4972, 4973, 4974, 4975, 4976]
16:49:01 src.node.docker_mananger DEBUG Starting container with image harbor.status.im/wakuorg/nwaku:latest
16:49:01 src.node.docker_mananger DEBUG Using args ['--listen-address=0.0.0.0', '--rpc=true', '--rpc-admin=true', '--rest=true', '--rest-admin=true', '--websocket-support=true', '--log-level=TRACE', '--rest-relay-cache-capacity=100', '--websocket-port=4974', '--rpc-port=4972', '--rest-port=4971', '--tcp-port=4973', '--discv5-udp-port=4975', '--rpc-address=0.0.0.0', '--rest-address=0.0.0.0', '--nat=extip:172.18.121.213', '--peer-exchange=true', '--discv5-discovery=true', '--cluster-id=0', '--metrics-server=true', '--metrics-server-address=0.0.0.0', '--metrics-server-port=4976', '--metrics-logging=true', '--relay=false', '--filter=true', '--discv5-bootstrap-node=enr:-Kq4QKq0U3Cy4-30vayvPFEcu0s6aU-aCoKf3RvktZKnjGkUH9Kl-W07aF4w6ihJuNdLZ7mcFntDH0AQvvL31_doatwBgmlkgnY0gmlwhKwScryKbXVsdGlhZGRyc4wACgSsEnK8BoTi3QOJc2VjcDI1NmsxoQM3Tqpf5eFn4Jztm4gB0Y0JVSJyxyZsW8QR-QU5DZb-PYN0Y3CChOGDdWRwgoTjhXdha3UyBQ', '--filternode=/ip4/172.18.114.188/tcp/34017/p2p/16Uiu2HAmGNtM2rQ8abySFNhqPDFY4cmfAEpfo9Z9fD3NekoFR2ip']
16:49:01 src.node.docker_mananger DEBUG Container started with ID 58760c0ce910. Setting up logs at ./log/docker/node2_2023-12-27_16-48-59__11ce13f6-d370-4692-995e-e4ea50cf4218__harbor.status.im_wakuorg_nwaku:latest.log
16:49:01 src.node.waku_node DEBUG Started container from image harbor.status.im/wakuorg/nwaku:latest. RPC: 4972 REST: 4971 WebSocket: 4974 TCP: 4973
16:49:01 src.libs.common DEBUG Sleeping for 1 seconds
16:49:02 src.node.api_clients.base_client INFO GET call: http://127.0.0.1:4971/debug/v1/info with payload: None
16:49:02 src.node.api_clients.base_client INFO Response status code: 200. Response content: b'{"listenAddresses":["/ip4/172.18.121.213/tcp/4973/p2p/16Uiu2HAm2AGPYn4pCquYbabKB7fv43nqHjV9ooXCuESAzsdigUNW","/ip4/172.18.121.213/tcp/4974/ws/p2p/16Uiu2HAm2AGPYn4pCquYbabKB7fv43nqHjV9ooXCuESAzsdigUNW"],"enrUri":"enr:-Kq4QMUlB3SRC-zw69dOBDjuMIxhl1ZotBzHTNotsXr6o4EGLCy3JwE7sL8rItfYqw0dLZw2vHueHxKPIwdsibxth-gBgmlkgnY0gmlwhKwSedWKbXVsdGlhZGRyc4wACgSsEnnVBhNu3QOJc2VjcDI1NmsxoQJkD2wtUARcDlXeCxpgSYQ6rsqxgrxAaC9ROduQnqIr04N0Y3CCE22DdWRwghNvhXdha3UyBA"}'
16:49:02 src.node.waku_node INFO REST service is ready !!
16:49:02 src.steps.filter DEBUG Running fixture setup: subscribe_main_nodes
16:49:02 src.node.api_clients.base_client INFO POST call: http://127.0.0.1:34015/relay/v1/subscriptions with payload: ["/waku/2/rs/0/0"]
16:49:02 src.node.api_clients.base_client INFO Response status code: 200. Response content: b'OK'
16:49:02 src.node.api_clients.base_client INFO POST call: http://127.0.0.1:4971/filter/v2/subscriptions with payload: {"requestId": "909311a0-ec16-41eb-b5da-16b335a609aa", "contentFilters": ["/test/1/waku-filter/proto"], "pubsubTopic": "/waku/2/rs/0/0"}
16:49:02 src.node.api_clients.base_client INFO Response status code: 200. Response content: b'{"requestId":"909311a0-ec16-41eb-b5da-16b335a609aa","statusCode":0,"statusDesc":""}'
------------------------------------------------------------------------------------------------ live log call ------------------------------------------------------------------------------------------------
16:49:02 src.node.api_clients.base_client INFO POST call: http://127.0.0.1:34015/relay/v1/messages/%2Fwaku%2F2%2Frs%2F0%2F0 with payload: {"payload": "RmlsdGVyIHdvcmtzISE=", "contentTopic": "/test/1/waku-filter/proto", "timestamp": 1703688542958019328}
16:49:02 src.node.api_clients.base_client INFO Response status code: 200. Response content: b'OK'
16:49:02 src.libs.common DEBUG Sleeping for 0.1 seconds
16:49:03 src.steps.filter DEBUG Checking that peer NODE_1:harbor.status.im/wakuorg/nwaku:latest can find the published message
16:49:03 src.node.api_clients.base_client INFO GET call: http://127.0.0.1:4971/filter/v2/messages/%2Ftest%2F1%2Fwaku-filter%2Fproto with payload: None
16:49:03 src.node.api_clients.base_client INFO Response status code: 200. Response content: b'[{"payload":"RmlsdGVyIHdvcmtzISE=","contentTopic":"/test/1/waku-filter/proto","version":0,"timestamp":1703688542958019328}]'
16:49:03 src.node.waku_node DEBUG Pausing container with id 58760c0ce910
16:49:03 src.node.api_clients.base_client INFO POST call: http://127.0.0.1:34015/relay/v1/messages/%2Fwaku%2F2%2Frs%2F0%2F0 with payload: {"payload": "RmlsdGVyIHdvcmtzISE=", "contentTopic": "/test/1/waku-filter/proto", "timestamp": 1703688543074802688}
16:49:08 src.node.api_clients.base_client ERROR HTTP error occurred: 500 Server Error: Internal Server Error for url: http://127.0.0.1:34015/relay/v1/messages/%2Fwaku%2F2%2Frs%2F0%2F0. Response content: b'Failed to publish: timedout'
gabrielmer commented 5 months ago

Hi @fbarbu15! Is there a test case I can run to reproduce this issue?

fbarbu15 commented 5 months ago

Hey, there were 2 tests but I just retested them on the latest nwaku version and the issue no longer reproduces. I guess it was fixed in the meantime as part of another task