Open QuantumEntangledAndy opened 3 years ago
What would be a good way to handle time out at this point?
It's been a while for this code :) ... the intent here was to have the timeout at the socket layer:
and then, failure to read would cause the OS to hang up the socket, we'd get 0 bytes (end of stream), and the top loop will deal with reconnecting.
It's interesting that you get std::io::ErrorKind::WouldBlock
now that I think about it -- my understanding is that you only get this if you ask for a non-blocking read. Roughly it means "I would return zero here, but that meaning is already taken." For blocking reads, reads that would block will, um, block.
(Side note, -EENDOFSTREAM
might have been a better design decision but that's only first glance. If I ever get a time machine I'll take it up with the Multics guys.)
Does this make sense?
I read a bit about it and whether it returns wouldblock being a bit of an os dependant thing. Perhaps I hit a corner case where both clones of the socket (remember the try_clone
before the polling thread) tried to do something at the same time. But because I was sending 100s of packets in quick succession it was repeatable.
I've also noticed something odd here which may be to do with what IoErrors my Mac sends:
When I'm working over a latent connection (SSH tunnel over cellular) I edit the code to bump up the timeout to a very large value (30mins) but I still get Read returned zero bytes. Perhaps there's another corner case where the socket can geniually return zero bytes?
Yeah, I think it's still reasonable for you to get an EWOULDBLOCK.
I still get Read returned zero bytes
Ooh neat. I'd really like to see that in Wireshark. It could be something in the tunnel chain saying "screw this connection in particular, I don't want to keep track of it" and sending an RST.
Idle sockets are supposed to stay connected, that's like internet 101, but lots of things (NAT, VPNs, etc) monkey with this ideal.
It would be really interesting if you saw a zero-byte read without a corresponding RST.
So I wanted to discuss this code.
https://github.com/thirtythreeforty/neolink/blob/f5fd552f2566fdfc278877482264f0a9a5986ae4/src/bc/de.rs#L57-L66
For awhile I've noticed that the timeout is ignored during deser even if
RX_TIMEOUT
is set to ubsurd long values like 30mins and I think it is because of this code.The take code simply tries to pull from the tcp stream and if no messages have come since the last read then it will return
Read returned 0 bytes
.Additionally in my
talk
code I sent a lot of message in quick succession and this caused the tcp stream to throw a Temporarily Unavailable error in thistake
code. In therefractor
PR #185 I've addressed this asWhich addresses the temporarily unavailable resource issue. But I feel that it should also have a timeout before it decides that it should that it is an unexpected EOF