thirtythreeforty / neolink

An RTSP bridge to Reolink IP cameras
https://www.thirtythreeforty.net/posts/2020/05/hacking-reolink-cameras-for-fun-and-profit/
GNU Affero General Public License v3.0
889 stars 147 forks source link

Should this have a timeout #188

Open QuantumEntangledAndy opened 3 years ago

QuantumEntangledAndy commented 3 years ago

So I wanted to discuss this code.

https://github.com/thirtythreeforty/neolink/blob/f5fd552f2566fdfc278877482264f0a9a5986ae4/src/bc/de.rs#L57-L66

For awhile I've noticed that the timeout is ignored during deser even if RX_TIMEOUT is set to ubsurd long values like 30mins and I think it is because of this code.

The take code simply tries to pull from the tcp stream and if no messages have come since the last read then it will return Read returned 0 bytes.

Additionally in my talk code I sent a lot of message in quick succession and this caused the tcp stream to throw a Temporarily Unavailable error in this take code. In the refractor PR #185 I've addressed this as

loop {
            match (&mut rdr)
                .take(to_read.get() as u64)
                .read_to_end(&mut input)
            {
                Ok(0) => {
                    return Err(std::io::Error::new(
                        std::io::ErrorKind::UnexpectedEof,
                        "Read returned 0 bytes",
                    )
                    .into());
                }
                Ok(_) => break,
                Err(e) if e.kind() == std::io::ErrorKind::WouldBlock => {
                    // This is a temporaily unavaliable resource
                    // We should just wait and try again
                }
                Err(e) => {
                    return Err(e.into());
                }
            }
        }

Which addresses the temporarily unavailable resource issue. But I feel that it should also have a timeout before it decides that it should that it is an unexpected EOF

QuantumEntangledAndy commented 3 years ago

What would be a good way to handle time out at this point?

thirtythreeforty commented 3 years ago

It's been a while for this code :) ... the intent here was to have the timeout at the socket layer:

https://github.com/thirtythreeforty/neolink/blob/5fc92f0bf451a8a7bdc2108d8cad286783c6d281/crates/core/src/bc_protocol/connection.rs#L203-L206

and then, failure to read would cause the OS to hang up the socket, we'd get 0 bytes (end of stream), and the top loop will deal with reconnecting.

It's interesting that you get std::io::ErrorKind::WouldBlock now that I think about it -- my understanding is that you only get this if you ask for a non-blocking read. Roughly it means "I would return zero here, but that meaning is already taken." For blocking reads, reads that would block will, um, block.

(Side note, -EENDOFSTREAM might have been a better design decision but that's only first glance. If I ever get a time machine I'll take it up with the Multics guys.)

Does this make sense?

QuantumEntangledAndy commented 3 years ago

I read a bit about it and whether it returns wouldblock being a bit of an os dependant thing. Perhaps I hit a corner case where both clones of the socket (remember the try_clone before the polling thread) tried to do something at the same time. But because I was sending 100s of packets in quick succession it was repeatable.

I've also noticed something odd here which may be to do with what IoErrors my Mac sends:

When I'm working over a latent connection (SSH tunnel over cellular) I edit the code to bump up the timeout to a very large value (30mins) but I still get Read returned zero bytes. Perhaps there's another corner case where the socket can geniually return zero bytes?

thirtythreeforty commented 3 years ago

Yeah, I think it's still reasonable for you to get an EWOULDBLOCK.

I still get Read returned zero bytes

Ooh neat. I'd really like to see that in Wireshark. It could be something in the tunnel chain saying "screw this connection in particular, I don't want to keep track of it" and sending an RST.

Idle sockets are supposed to stay connected, that's like internet 101, but lots of things (NAT, VPNs, etc) monkey with this ideal.

It would be really interesting if you saw a zero-byte read without a corresponding RST.