Closed tegefaulkes closed 8 months ago
This needs to be tested across platforms to check whether the behaviour is consistent.
But it should also be treated as a per-connection error.
So basically the problem is that we're conflating problems with the socket causing errors with problems with send causing errors. The main one we ran up against is invalid input to send causing an EINVAL error since the address was invalid for how the socket was bound.
The solution here is to take any error coming out of the send and check whether it's a problem specific to the connection or packet send attempt, or some grander problem with the socket itself. If the problem is specific only to whatever was using the send at the time, then we need to pass the error along to it.
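A minimal sketch of that classification step, assuming the thrown error carries a Node.js-style string `code` property. The names `SendError`, `classifySendError`, and `routeSendError` are hypothetical and not part of js-quic, and the list of per-send codes is an assumption that would need checking on each platform:

```typescript
// Hypothetical error shape: Node.js system errors expose a string `code`.
type SendError = Error & { code?: string };

type SendErrorScope = 'connection' | 'socket';

// Decide whether a send failure only concerns the caller of that send,
// or whether it should be treated as a failure of the whole socket.
function classifySendError(e: SendError): SendErrorScope {
  switch (e.code) {
    // Assumption: these codes depend on the destination of a single send.
    case 'EINVAL':
    case 'ENETUNREACH':
    case 'EHOSTUNREACH':
      return 'connection';
    // Anything else is conservatively treated as a socket-level problem.
    default:
      return 'socket';
  }
}

// Pass the error to whichever handler matches its scope.
function routeSendError(
  e: SendError,
  onConnectionError: (e: SendError) => void,
  onSocketError: (e: SendError) => void,
): void {
  if (classifySendError(e) === 'connection') {
    onConnectionError(e);
  } else {
    onSocketError(e);
  }
}
```

With this shape, an EINVAL from a send to an unreachable address would reach only the connection that attempted it, while unknown failures still surface at the socket level.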
This leads on to further work where we make js-quic tolerant of any network dropouts because I noticed when unplugging the Ethernet cable the node crashes. Actually, I'll make a bug report for that.
> The solution here is to take any error coming out of the send and checking if it's a problem specific to the connection or packet send attempt, or some grander problem with the socket itself. If the problem is specific to only whatever was using the send at the time then we need to pass the error along to that.
Yes this is needed, but not by changing the dataflow directions. This is best done with a combination of 2 things:
That way it's possible to throw up the errors or dispatch the right events, and have the relevant consumers decide whether an event is relevant to them. Even connection IDs can be part of these events, so that a particular QUICConnection can check whether an event is relevant to itself.
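As a rough illustration of consumers filtering events by connection ID, here is a small sketch. `SocketEvents`, `SendErrorEvent`, and `Connection` are hypothetical stand-ins written for this example. They are not the js-quic event API, and `Connection` is only a placeholder for a QUICConnection:

```typescript
// Hypothetical event payload: the error plus the ID of the connection
// whose send attempt produced it (absent for socket-wide problems).
interface SendErrorEvent {
  connectionId?: string;
  error: Error;
}

type Listener = (evt: SendErrorEvent) => void;

// Minimal dispatcher standing in for the socket's event system.
class SocketEvents {
  private listeners: Listener[] = [];

  addListener(listener: Listener): void {
    this.listeners.push(listener);
  }

  dispatch(evt: SendErrorEvent): void {
    for (const listener of this.listeners) {
      listener(evt);
    }
  }
}

// Placeholder connection that only records errors addressed to its own ID.
class Connection {
  public errors: Error[] = [];

  constructor(public readonly id: string, events: SocketEvents) {
    events.addListener((evt) => {
      if (evt.connectionId === this.id) {
        this.errors.push(evt.error);
      }
    });
  }
}
```

The dataflow direction stays the same: the socket only dispatches, and each consumer decides for itself whether the event matters.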
Remember to assign issues to yourself if you're working on them.
Re-opening this. There were some problems with the CI after merging #86.
I'll be applying fixes to the same branch feature-socket-errors
Describe the bug
There have been test failures in the integration:docker CI job for the Polykey-cli. Exploring the problem locally, we found that the configuration and parameters used for the tests were causing QUICSocket._send to error out with EINVAL.

From what we can tell, the PolykeyAgent is starting while bound to a local address 127.0.0.1, but during the course of the testing connections are being attempted to the external testnet with an address such as 3.139.146.137. So specifically the combination of host networking in docker, binding to the loopback, and trying to connect externally is causing an EINVAL error in QUICSocket._send, which bubbles up to the top of the process and crashes the program.

The following is the console.error output of the error after running into the problem. We obtained this by adding the console.error() to the handlers for uncaught exceptions.

I suspect this may be platform specific to docker since this problem never occurs during normal testing.
To Reproduce
1. In the polykeyCLI repo you need to build the docker image using the normal methods shown in the README.MD.
2. Run docker run --network host -it $image agent start --network testnet -np /tmp --verbose --agent-host 127.0.0.1
3. Observe the ErrorQUICClientInternal error and crash.

Expected behaviour
Even when bound to a loopback, connection attempts to external addresses shouldn't crash. These should be errors specific to the connection being made. In fact any EINVAL errors should be thrown back to the connection in some fashion and never be taken as a full failure of the socket.

Platform (please complete the following information)
Additional context
Notify maintainers