Closed int08h closed 1 year ago
Backtrace of bug
Nov 27 23:45:15 roughenough-1f run_server.bash[3662]: 1: core::panicking::panic_fmt
Nov 27 23:45:15 roughenough-1f run_server.bash[3662]: at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/core/src/panicking.rs:100:14
Nov 27 23:45:15 roughenough-1f run_server.bash[3662]: 2: core::result::unwrap_failed
Nov 27 23:45:15 roughenough-1f run_server.bash[3662]: at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/core/src/result.rs:1616:5
Nov 27 23:45:15 roughenough-1f run_server.bash[3662]: 3: roughenough::responder::Responder::send_responses
Nov 27 23:45:15 roughenough-1f run_server.bash[3662]: 4: roughenough::server::Server::process_events
Nov 27 23:45:15 roughenough-1f run_server.bash[3662]: 5: roughenough_server::main
Fix has been deployed
The problem was in how responses were sent to clients. This is the problematic block in responder.rs:
let bytes_sent = socket
    .send_to(&resp_bytes, &src_addr)
    .expect("send_to failed");
Roughtime's server is non-blocking (async) and built on Mio. The client socket here is a mio::net::UdpSocket, and it is attempting to send (send_to()) a reply (&resp_bytes) to a client (&src_addr).
The EAGAIN/EWOULDBLOCK response from send_to() is Linux telling us "resources are full, try again later". But as you can see in the snippet, the result of calling send_to() is never handled: .expect() turns any error into a panic. There is no reattempt. Instead we get a runtime panic.
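To illustrate what handling that result could look like, here is a minimal sketch that separates the "try again later" case from other failures instead of panicking. The names SendOutcome and classify_send are hypothetical and are not taken from the roughenough source; the function takes the io::Result<usize> that a send_to() call returns, so it does not depend on Mio.

```rust
use std::io;

/// Hypothetical classification of one non-blocking send attempt.
#[derive(Debug)]
pub enum SendOutcome {
    /// The datagram was handed to the kernel; holds the byte count.
    Sent(usize),
    /// EAGAIN/EWOULDBLOCK: the send could not proceed right now.
    WouldBlock,
    /// Any other I/O error.
    Failed(io::Error),
}

/// Turn the result of a `send_to()` call into a `SendOutcome`
/// rather than unwrapping it.
pub fn classify_send(result: io::Result<usize>) -> SendOutcome {
    match result {
        Ok(n) => SendOutcome::Sent(n),
        Err(e) if e.kind() == io::ErrorKind::WouldBlock => SendOutcome::WouldBlock,
        Err(e) => SendOutcome::Failed(e),
    }
}
```

A caller would pass socket.send_to(&resp_bytes, &src_addr) straight into classify_send and decide per variant what to do.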
The "fix" is a band-aid: check the return value of send_to(), bump a counter on any error, and otherwise ignore the error.
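A rough sketch of that band-aid follows. It is not the deployed code: SendStats, record_send, and send_response are hypothetical names, and std::net::UdpSocket stands in for mio::net::UdpSocket so the example compiles without external crates.

```rust
use std::io;
use std::net::{SocketAddr, UdpSocket};

/// Hypothetical counters; the real server's metrics may look different.
#[derive(Debug, Default)]
pub struct SendStats {
    pub sent: u64,
    pub failed: u64,
}

/// Record the outcome of one send attempt. Errors are counted and
/// then dropped, so a full send buffer no longer causes a panic.
pub fn record_send(stats: &mut SendStats, result: io::Result<usize>) {
    match result {
        Ok(_) => stats.sent += 1,
        Err(_) => stats.failed += 1,
    }
}

/// Send one response and account for the result instead of unwrapping it.
/// Assumption: a std socket is used here in place of the Mio socket.
pub fn send_response(
    socket: &UdpSocket,
    resp_bytes: &[u8],
    src_addr: &SocketAddr,
    stats: &mut SendStats,
) {
    record_send(stats, socket.send_to(resp_bytes, src_addr));
}
```

The trade-off is that a response that fails to send is silently lost; the client sees a timeout and has to ask again.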
One might think "a correct fix is to re-attempt delivery". That's probably correct, but the retry logic must ensure that the MIDP time of the in-flight response is still within the uncertainty RADI.
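The check a retry path would need can be sketched as below. This is an illustration only: still_within_radius is a hypothetical helper, and it assumes MIDP and RADI are both expressed in microseconds and that the caller supplies the current time in the same unit.

```rust
/// Returns true if `now_micros` is still inside the window
/// [MIDP - RADI, MIDP + RADI] claimed by an already-built response.
/// Assumption: all three values are in microseconds.
pub fn still_within_radius(midp_micros: u64, radi_micros: u32, now_micros: u64) -> bool {
    // abs_diff avoids underflow when `now` is earlier than the midpoint.
    now_micros.abs_diff(midp_micros) <= u64::from(radi_micros)
}
```

If the check fails, resending the stored bytes would deliver a timestamp whose stated uncertainty no longer covers the actual send time, so under this sketch the response would be dropped rather than retried.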
Full log message was:
thread 'main' panicked at 'send_to failed: Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" }', src/responder.rs:114:18
Seen on roughtime.int08h.com