lsankar4033 / stethoscope

Making sure Eth2 clients are breathing properly (network testing)
MIT License

Discovery handshake #19

Open jrhea opened 4 years ago

jrhea commented 4 years ago

Found a bug that caused Lighthouse to crash intermittently.

Issue: intermittent invalid Discv5 session causes panic - https://github.com/sigp/lighthouse/issues/1033

Fix: https://github.com/sigp/rust-libp2p/commit/46415e9467a42e70c6bb3d2525354e648867c6a3

This one is difficult to write a test for, but one idea would be to shuffle the handshake message ordering and delay responses. I'm not sure what the pass/fail criterion would be, though. Maybe just whether the client crashes? Adding it here for discussion.
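To make the shuffle-and-delay idea concrete, here is a minimal sketch of a seeded schedule generator. Everything in it is an assumption for illustration: the message labels are placeholders rather than real Discv5 packet encodings, `perturbed_schedule` and `ScheduledSend` are hypothetical names, and nothing here is part of stethoscope. The point is only that a fixed seed makes a "random" reordering reproducible, so a crash found with one seed can be replayed.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ScheduledSend:
    """One handshake message and how long to wait before sending it."""
    delay_ms: int
    message: str


def perturbed_schedule(messages, seed, max_delay_ms=500):
    """Return the messages in a shuffled order, each with a random delay.

    The same seed always yields the same schedule, so a failing run
    can be reproduced exactly.
    """
    rng = random.Random(seed)  # private RNG, independent of global state
    order = list(messages)
    rng.shuffle(order)
    return [ScheduledSend(rng.randrange(max_delay_ms + 1), m) for m in order]


# Placeholder labels for the handshake steps (assumed, not wire formats).
HANDSHAKE_STEPS = ["RANDOM_PACKET", "WHOAREYOU", "AUTH_HEADER"]

if __name__ == "__main__":
    for step in perturbed_schedule(HANDSHAKE_STEPS, seed=19):
        print(f"wait {step.delay_ms} ms, then send {step.message}")
```

A test built on this could sweep over a range of seeds and assert only that the client under test is still responsive afterwards, which matches the "does it crash?" criterion.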

lsankar4033 commented 4 years ago

Ah ya, I saw this issue when you posted it and have been thinking about it.

As I understand it, this issue has to do with crossed streams during discovery, caused by dropped messages or by network latency combined with retry logic. I think a bunch of scenarios in that category are worth putting into stethoscope.

Need to put some more thought into how to deterministically drop messages between two nodes driven by rumor or prrkl.
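One possible way to get deterministic drops, sketched under assumptions: derive the drop decision from a hash of a test seed plus the message's identity, instead of from a stateful random generator. The function name `should_drop`, the node labels, and the use of a per-link sequence number are all hypothetical, and this does not use any rumor or prrkl API. It only shows the decision logic that a proxy or test driver sitting between the two nodes could apply.

```python
import hashlib


def should_drop(seed: int, src: str, dst: str, seq: int, drop_rate: float) -> bool:
    """Decide whether the seq-th message from src to dst is dropped.

    The decision depends only on the arguments, not on call order or
    timing, so two runs with the same seed drop exactly the same messages.
    """
    key = f"{seed}|{src}|{dst}|{seq}".encode()
    digest = hashlib.sha256(key).digest()
    value = int.from_bytes(digest[:8], "big")  # uniform in [0, 2**64)
    threshold = int(drop_rate * 2**64)         # integer compare avoids float rounding
    return value < threshold


if __name__ == "__main__":
    # Which of the first 10 messages from node A to node B get dropped?
    dropped = [seq for seq in range(10) if should_drop(19, "A", "B", seq, 0.3)]
    print("dropped sequence numbers:", dropped)
```

Because the decision is a pure function of its inputs, retries can be given their own sequence numbers and the scenario "first attempt dropped, retry delivered" becomes reproducible across runs.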