paritytech / polkadot

Polkadot Node Implementation
GNU General Public License v3.0

"Rejected connection: Transport(i/o error: unexpected end of file" in logs after v0.9.28 upgrade #5956

Closed juliajang closed 1 year ago

juliajang commented 2 years ago

Hi team! I have validators for Polkadot and Kusama, and I'm seeing this error in the logs for both after upgrading to v0.9.28:

2022-09-01 15:42:10 Accepting new connection 4/100
2022-09-01 15:42:10 Rejected connection: Transport(i/o error: unexpected end of file

Caused by:
    unexpected end of file)
2022-09-01 15:42:12 ✨ Imported #14263391 (0x2a5f…6fbd)

These are the flags I pass when starting my validator:

polkadot --base-path /chain/data --chain kusama --rpc-cors=all --unsafe-rpc-external --unsafe-ws-external --port "40333" --pruning=archive

I'm wondering whether I'm missing a required flag or some other change, since this only happens after the upgrade to v0.9.28 and not in the previous version (i.e. v0.9.27 does not show these logs).

bkchr commented 2 years ago

Cc @niklasad1

niklasad1 commented 2 years ago

Hey,

It indeed seems like a bug.

Can you explain how you run your node, e.g. behind an nginx proxy, a load balancer, or something similar? I have seen something similar on a few nodes, but I haven't been able to reproduce it.

This is a socket error that occurs when trying to complete the WS handshake. As far as I can see, nothing has really changed in that area in this release, but it could be a regression in jsonrpsee v0.15.1.
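For readers unfamiliar with the WS handshake mentioned above, here is a minimal sketch of what a client sends to open a WebSocket connection and how the expected `Sec-WebSocket-Accept` reply is derived per RFC 6455. The host `localhost:9944` is only a placeholder assumption, and the helper names are hypothetical; this is not code from polkadot or jsonrpsee.

```python
import base64
import hashlib
import os

# Fixed GUID that RFC 6455 defines for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def build_upgrade_request(host: str, path: str = "/") -> tuple[str, str]:
    """Return (raw_request, key) for a client opening handshake."""
    # The key is 16 random bytes, base64-encoded.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Key: {key}\r\n"
        "Sec-WebSocket-Version: 13\r\n"
        "\r\n"
    )
    return request, key


def expected_accept(key: str) -> str:
    """Value the server must return in Sec-WebSocket-Accept for this key."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # "localhost:9944" is a placeholder host, not taken from this issue.
    request, key = build_upgrade_request("localhost:9944")
    print(request)
    print("expected Sec-WebSocket-Accept:", expected_accept(key))
```

If the peer closes the socket before this exchange completes, the server side sees an end-of-file while still reading the request, which would be consistent with the error in the logs.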

juliajang commented 2 years ago

@niklasad1 We run our servers behind a fleet of proxy servers, which sit behind a cloud-hosted Layer 4 load balancer.

niklasad1 commented 2 years ago

right, it will be hard for me to try to reproduce that locally.

do you have any idea how to reproduce this or any additional logs to share?

niklasad1 commented 2 years ago

Hey again, I looked at the code again: in versions <= polkadot v0.9.27 we never logged when a connection request failed, so the behavior is probably the same as in polkadot v0.9.28. It could just be that the client dropped the connection directly after opening it (but I'm not sure; I'm trying to reproduce that myself).
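To illustrate the "client dropped the connection directly after opening it" case, here is a small self-contained sketch using only the Python standard library. It is not the node's code: the toy server, the function names, and the message text (chosen only to echo the log line above) are all assumptions for illustration.

```python
import socket
import threading


def read_handshake(conn: socket.socket) -> bytes:
    """Read until the end of the HTTP headers; raise EOFError if the peer closes first."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = conn.recv(4096)
        if not chunk:  # recv() returning b"" means the peer closed the connection
            raise EOFError("unexpected end of file")
        data += chunk
    return data


def accept_one(server: socket.socket, results: list) -> None:
    """Accept a single connection and record what happened to its handshake."""
    conn, _ = server.accept()
    with conn:
        try:
            read_handshake(conn)
            results.append("handshake received")
        except EOFError as err:
            results.append(f"Rejected connection: {err}")


def demo() -> str:
    """Run a toy server on a free loopback port and connect to it without sending data."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    results: list = []
    worker = threading.Thread(target=accept_one, args=(server, results))
    worker.start()
    # Client side: open a TCP connection and close it immediately,
    # the way a bare TCP health check might.
    client = socket.create_connection(("127.0.0.1", port))
    client.close()
    worker.join(timeout=5)
    server.close()
    return results[0]


if __name__ == "__main__":
    print(demo())
```

A plain TCP connect-and-close probe from a Layer 4 load balancer would look like this client from the server's point of view, although whether that is what happens in this setup is still a guess.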

For instance, you may have some health check running against the WebSocket server, and after v0.15.1 we reply with HTTP status code 403 to any request that isn't an HTTP upgrade request.

See https://github.com/paritytech/jsonrpsee/issues/818 for further information.
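As a rough illustration of the health check hypothesis above, the sketch below classifies a raw HTTP request the way a WebSocket-only server might: 101 for an upgrade request, 403 for anything else. This is an assumption-laden toy, not jsonrpsee's actual logic; a real server also validates `Sec-WebSocket-Key` and the protocol version, and the `/health` path is a hypothetical example.

```python
def parse_headers(raw: bytes) -> dict[str, str]:
    """Parse the header block of a raw HTTP request into a lowercase-keyed dict."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    headers: dict[str, str] = {}
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def status_for(raw: bytes) -> int:
    """Return 101 for a WebSocket upgrade request, 403 otherwise (toy rule)."""
    headers = parse_headers(raw)
    upgrade = headers.get("upgrade", "").lower()
    # The Connection header may carry several comma-separated tokens.
    connection = [t.strip() for t in headers.get("connection", "").lower().split(",")]
    is_upgrade = upgrade == "websocket" and "upgrade" in connection
    return 101 if is_upgrade else 403


# Hypothetical plain HTTP health check, as a proxy might send it.
HEALTH_CHECK = b"GET /health HTTP/1.1\r\nHost: localhost\r\n\r\n"

# Hypothetical WebSocket opening handshake.
WS_UPGRADE = (
    b"GET / HTTP/1.1\r\n"
    b"Host: localhost\r\n"
    b"Upgrade: websocket\r\n"
    b"Connection: keep-alive, Upgrade\r\n"
    b"Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==\r\n"
    b"Sec-WebSocket-Version: 13\r\n"
    b"\r\n"
)

if __name__ == "__main__":
    print(status_for(HEALTH_CHECK))
    print(status_for(WS_UPGRADE))
```

Under this rule, a periodic plain `GET` from a monitoring system would be rejected on every attempt, which fits a rejection showing up in the logs at a fixed interval.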

chevdor commented 2 years ago

We are seeing this very consistently on our Burnin machines on Westend (burnin for v0.9.30-rc3).

niklasad1 commented 2 years ago

yeah, but the hypothesis is that these are HTTP health checks against the WebSocket server, since it happens periodically (every 10 seconds or something like that).

These will go away in the next jsonrpsee release anyway, which will have a server that supports WS and HTTP on the same socket.

jsdw commented 2 years ago

Mmm, looking at a single validator from the logs above, you can see exactly one message every 10 seconds, which is very periodic and would indeed imply to me some automated check that's misconfigured.
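One way to check the periodicity claim on your own node is to measure the gaps between consecutive "Rejected connection" lines. The sketch below assumes the timestamp format shown in the log excerpt at the top of this issue (`YYYY-MM-DD HH:MM:SS` at the start of the line). Only the first sample line comes from that excerpt; the later timestamps are made up for illustration.

```python
from datetime import datetime


def rejection_intervals(log_lines: list[str]) -> list[float]:
    """Return the seconds between consecutive 'Rejected connection' log lines."""
    stamps = []
    for line in log_lines:
        if "Rejected connection" not in line:
            continue
        # The first 19 characters hold the timestamp, e.g. "2022-09-01 15:42:10".
        stamps.append(datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S"))
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


def looks_periodic(intervals: list[float], tolerance: float = 1.0) -> bool:
    """True if there are at least two gaps and all are within `tolerance` seconds of the first."""
    if len(intervals) < 2:
        return False
    return all(abs(gap - intervals[0]) <= tolerance for gap in intervals)


# Sample data: timestamps after the first line are hypothetical.
SAMPLE = [
    "2022-09-01 15:42:10 Accepting new connection 4/100",
    "2022-09-01 15:42:10 Rejected connection: Transport(i/o error: unexpected end of file",
    "2022-09-01 15:42:12 ✨ Imported #14263391 (0x2a5f…6fbd)",
    "2022-09-01 15:42:20 Rejected connection: Transport(i/o error: unexpected end of file",
    "2022-09-01 15:42:30 Rejected connection: Transport(i/o error: unexpected end of file",
]

if __name__ == "__main__":
    gaps = rejection_intervals(SAMPLE)
    print(gaps, looks_periodic(gaps))
```

Evenly spaced gaps point toward an automated probe rather than real client traffic, which tends to arrive irregularly.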

chevdor commented 2 years ago

@juliajang could you tell us a bit more about your infra? Are you using K8s?

danforbes commented 1 year ago

I opened this issue https://github.com/rerun-io/ewebsock/issues/5 because I am seeing this error when I try to connect from the ewebsock WebSocket library. I'm running polkadot 0.9.29-94078b44fb6 with the following command: polkadot --chain westend-dev --alice --tmp --rpc-cors all --unsafe-ws-external. I am able to connect to the same node using a JavaScript WebSocket instance.

niklasad1 commented 1 year ago

This will be fixed when https://github.com/paritytech/substrate/pull/12663 is merged

niklasad1 commented 1 year ago

Closing, this should be fixed in polkadot v0.9.36