Open jrhea opened 4 years ago
Ah ya, I'd seen when you posted this issue and had been thinking about it.
As I understand it, this issue has to do with crossed streams during discovery caused by dropped messages, network latency, and retry logic. I think a bunch of scenarios that fall into that category are worth putting into stethoscope.
Need to put some more thought into how to make the dropping of messages between two nodes deterministic when driven by rumor or prrkl.
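One possible direction, as a rough sketch only: derive every drop decision from a seeded RNG so a failing run can be replayed exactly. Nothing here is rumor or prrkl API; `DeterministicDropper` and `deliver` are hypothetical names made up for illustration.

```python
import random


class DeterministicDropper:
    """Hypothetical helper: reproducible per-message drop decisions."""

    def __init__(self, seed, drop_rate):
        # A private Random instance keeps the sequence independent of
        # any other code that uses the global random module.
        self._rng = random.Random(seed)
        self.drop_rate = drop_rate

    def should_drop(self):
        # random() returns a float in [0.0, 1.0), so drop_rate=0.0 never
        # drops and drop_rate=1.0 always drops.
        return self._rng.random() < self.drop_rate


def deliver(messages, dropper):
    """Return the messages that survive; decisions are made in send order."""
    return [m for m in messages if not dropper.should_drop()]
```

Logging the seed alongside a failure would then be enough to reproduce the same drop pattern between the two nodes.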
Found a bug that caused the client to crash intermittently.
Issue: intermittent invalid Discv5 session causes panic - https://github.com/sigp/lighthouse/issues/1033
Fix: https://github.com/sigp/rust-libp2p/commit/46415e9467a42e70c6bb3d2525354e648867c6a3
This one is difficult to write a test for, but one idea would be to shuffle handshake message ordering and delay responses. Not sure what the test would be though - does it crash? Just adding it for the purpose of discussion.
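To make the idea concrete for discussion, here is a minimal sketch of what such a test could look like, with "does not crash" as the only assertion. Everything in it is an assumption: the message labels are placeholders rather than the real Discv5 wire types, and `ToySession` is a hypothetical stand-in for the client's session handler.

```python
import random

# Placeholder labels for handshake steps, not real Discv5 packet types.
HANDSHAKE = ["random_packet", "whoareyou", "auth_response"]


def perturbed_schedule(messages, seed, max_delay_ms=500):
    """Shuffle message order and attach a delay to each, reproducibly."""
    rng = random.Random(seed)
    order = list(messages)
    rng.shuffle(order)
    return [(rng.randrange(max_delay_ms + 1), msg) for msg in order]


class ToySession:
    """Hypothetical handler: rejects out-of-order messages, never raises."""

    def __init__(self):
        self.expected = 0
        self.rejected = []

    def handle(self, msg):
        if self.expected < len(HANDSHAKE) and msg == HANDSHAKE[self.expected]:
            self.expected += 1
        else:
            # An unexpected message is recorded instead of causing a panic.
            self.rejected.append(msg)


def run_case(seed):
    session = ToySession()
    for _delay_ms, msg in perturbed_schedule(HANDSHAKE, seed):
        # This toy loop ignores the delay; a real harness would wait
        # that long before sending the message.
        session.handle(msg)
    return session
```

Running `run_case` over a range of seeds and checking that it never raises is the weakest useful property; a stronger test would also check that the session ends up either fully established or cleanly reset.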