Closed viraptor closed 3 years ago
Is this a known remote peer by any chance?
a9810a6b696e44b00ac332dc53f25bf68f86720d is the only occasion which could've introduced some wire encoding incompatibility. So this peer must be running a build from before that?
So looking through logs again, I noticed this happening from sprout. We should be both running the latest version of -bins
, I believe:
May 25 12:32:46 radicle-seed.viraptor.info radicle-seed-node[130268]: May 25 12:32:46.668 WARN accept{peer_id=hyy4ebh8amk3ypyawgpjnfpbgkxqwdo5n1iif5s17yghwmewrf4d8a}:incoming:incoming{remote_id=hynkyndc6w3p8urucakobzna7sxwgcqny7xxtw88dtx3pkf7m3nrzc remote_addr=34.91.134.19:12345}:incoming{remote_id=hynkyndc6w3p8urucakobzna7sxwgcqny7xxtw88dtx3pkf7m3nrzc remote_addr=34.91.134.19:12345}: librad::net::protocol::io::recv::gossip: gossip recv error err=Cbor(Decode(TypeMismatch(17, "expected array")))
May 25 12:33:08 radicle-seed.viraptor.info radicle-seed-node[130268]: May 25 12:33:08.985 WARN accept{peer_id=hyy4ebh8amk3ypyawgpjnfpbgkxqwdo5n1iif5s17yghwmewrf4d8a}:incoming:incoming{remote_id=hynkyndc6w3p8urucakobzna7sxwgcqny7xxtw88dtx3pkf7m3nrzc remote_addr=34.91.134.19:12345}:incoming{remote_id=hynkyndc6w3p8urucakobzna7sxwgcqny7xxtw88dtx3pkf7m3nrzc remote_addr=34.91.134.19:12345}: librad::net::protocol::io::recv::gossip: gossip recv error err=Cbor(Decode(TypeMismatch(17, "expected array")))
May 25 12:33:08 radicle-seed.viraptor.info radicle-seed-node[130268]: May 25 12:33:08.985 WARN accept{peer_id=hyy4ebh8amk3ypyawgpjnfpbgkxqwdo5n1iif5s17yghwmewrf4d8a}:incoming:incoming{remote_id=hynkyndc6w3p8urucakobzna7sxwgcqny7xxtw88dtx3pkf7m3nrzc remote_addr=34.91.134.19:12345}:incoming{remote_id=hynkyndc6w3p8urucakobzna7sxwgcqny7xxtw88dtx3pkf7m3nrzc remote_addr=34.91.134.19:12345}: librad::net::protocol::io::recv::gossip: gossip recv error err=Cbor(Decode(TypeMismatch(17, "expected array")))
May 25 12:33:08 radicle-seed.viraptor.info radicle-seed-node[130268]: May 25 12:33:08.985 WARN accept{peer_id=hyy4ebh8amk3ypyawgpjnfpbgkxqwdo5n1iif5s17yghwmewrf4d8a}:incoming:incoming{remote_id=hynkyndc6w3p8urucakobzna7sxwgcqny7xxtw88dtx3pkf7m3nrzc remote_addr=34.91.134.19:12345}:incoming{remote_id=hynkyndc6w3p8urucakobzna7sxwgcqny7xxtw88dtx3pkf7m3nrzc remote_addr=34.91.134.19:12345}: librad::net::protocol::io::recv::gossip: gossip recv error err=Cbor(Decode(TypeMismatch(114, "expected array")))
May 25 12:33:08 radicle-seed.viraptor.info radicle-seed-node[130268]: May 25 12:33:08.985 WARN accept{peer_id=hyy4ebh8amk3ypyawgpjnfpbgkxqwdo5n1iif5s17yghwmewrf4d8a}:incoming:incoming{remote_id=hynkyndc6w3p8urucakobzna7sxwgcqny7xxtw88dtx3pkf7m3nrzc remote_addr=34.91.134.19:12345}:incoming{remote_id=hynkyndc6w3p8urucakobzna7sxwgcqny7xxtw88dtx3pkf7m3nrzc remote_addr=34.91.134.19:12345}: librad::net::protocol::io::recv::gossip: gossip recv error err=Cbor(Decode(TypeMismatch(114, "expected array")))
May 25 12:33:17 radicle-seed.viraptor.info radicle-seed-node[130268]: May 25 12:33:17.192 WARN accept{peer_id=hyy4ebh8amk3ypyawgpjnfpbgkxqwdo5n1iif5s17yghwmewrf4d8a}:incoming:incoming{remote_id=hynkyndc6w3p8urucakobzna7sxwgcqny7xxtw88dtx3pkf7m3nrzc remote_addr=34.91.134.19:12345}:incoming{remote_id=hynkyndc6w3p8urucakobzna7sxwgcqny7xxtw88dtx3pkf7m3nrzc remote_addr=34.91.134.19:12345}: librad::net::protocol::io::recv::gossip: gossip recv error err=Cbor(Decode(TypeMismatch(114, "expected array")))
May 25 12:33:17 radicle-seed.viraptor.info radicle-seed-node[130268]: May 25 12:33:17.192 WARN accept{peer_id=hyy4ebh8amk3ypyawgpjnfpbgkxqwdo5n1iif5s17yghwmewrf4d8a}:incoming:incoming{remote_id=hynkyndc6w3p8urucakobzna7sxwgcqny7xxtw88dtx3pkf7m3nrzc remote_addr=34.91.134.19:12345}:incoming{remote_id=hynkyndc6w3p8urucakobzna7sxwgcqny7xxtw88dtx3pkf7m3nrzc remote_addr=34.91.134.19:12345}: librad::net::protocol::io::recv::gossip: gossip recv error err=Cbor(Decode(TypeMismatch(114, "expected array")))
I suspect that the decoder somehow receives an EOF before the whole CBOR message is transmitted. Those type mismatches are basically random.
:( Would it be worth adding a message length ahead of it, to tell a difference between "incomplete read" and "invalid message"?
Ya, although that would be a breaking change. RFC 8949 explicitly says that back-to-back encoding should be preferred — it’s more of a limitation of Rust idioms which propagates the wrong error message here.
More interesting would be to find out how such a partial transmission can happen in the first place. It must be that the sender is actually shutting down, and discards outstanding packets. I’m aware that we should actually call Endpoint::wait_idle
, but couldn’t find a good place to call it (it would need to be in main
)
It must be that the sender is actually shutting down
With the timing (few times in a second, then 9s break), is that likely a shutdown? Actually you can probably verify the restart in sprout logs.
I'm seeing this error after updating to 0.2. Happened only 2 times so far:
librad::net::protocol::io::recv::gossip: gossip recv error err=Cbor(Decode(TypeMismatch(114, "expected array")))
longer log chunk:
I don't see any obvious way to reproduce this. Can I provide any more detailed logs to help narrow this down?