ethereum / hive

Ethereum end-to-end test harness
GNU General Public License v3.0
383 stars 437 forks source link

Revert pull/1128 #1129

Closed emhane closed 5 days ago

emhane commented 5 days ago

Reverts https://github.com/ethereum/hive/pull/1128

Reth is running discv5 as an additional service, with its own port. https://github.com/ethereum/hive/issues/1073 needs another solution than using same port.

emhane commented 5 days ago

we can disable discv4 for discv5 tests, not sure how best way to do that is - make a second client reth-discv5, or change the discv5 tests to try port 9000 if client is reth?

fjl commented 5 days ago

So reth can't have both on the same port? It's kind of required to have both discv4 and discv5 on the same port, otherwise you can't really participate in mainnet with both. ENR can only have one UDP port.

emhane commented 5 days ago

So reth can't have both on the same port? It's kind of required to have both on the same port, otherwise you can't really participate in mainnet with both. ENR can only have one UDP port.

Each discovery service has its own ENR

fjl commented 5 days ago

I'm merging the revert for now if it helps you with test failures.

fjl commented 5 days ago

Each discovery service has its own ENR

Each with its own node ID as well?

emhane commented 5 days ago

Each discovery service has its own ENR

Each with its own node ID as well?

no same node id.

reth's discv5 default port is same as CL default. majority of nodes running discv5, are CL. in the same sense like http doesn't change port based on which app is using the protocol, it makes sense that discv5 doesn't change default port based on whether EL or CL is using it.

fjl commented 5 days ago

So you're basically publishing two different ENRs for the same node ID? That's not great. The point of ENR is being interchangeable between different discovery mechanisms. And it can cause problems, like if a client has a global node cache, the different ENRs would overwrite each other. Can you please make it supported to run both on the same UDP port?

Regarding the specific port number, I don't think it's important. Nowhere in any document do we prescribe a default port number to use. You can run it on any port. CL chose 9000 for some reason, but that doesn't mean EL can't run discv5 on 30303.

fjl commented 5 days ago

in the same sense like http doesn't change port based on which app is using the protocol,

It's different for HTTP because it has a default port. If you specify a http URL without an explicit port, then port 80 is assumed. However, this cannot occur for discv5 since it uses ENR as the 'URL', and there will always be a port included there. If you don't put a port into the ENR, the node should be considered not connectable by discovery.

emhane commented 5 days ago

So you're basically publishing two different ENRs for the same node ID? That's not great. The point of ENR is being interchangeable between different discovery mechanisms. And it can cause problems, like if a client has a global node cache, the different ENRs would overwrite each other. Can you please make it supported to run both on the same UDP port?

it's not that simple since we are running sigp/discv5 as an external library. sigp/discv5 offers support for parallel ipv6 + ipv4 and have double receive buffers for better performance. filtering traffic before that would disable this perf boost. since discv4 can eventually be replaced with discv5, there isn't much reason to focus efforts on further developing features for the older version atm. the only problem that is definitely being caused by not using the same port is in hivetests. the tcp port is the same in both discv4 and discv5 ENRs, so on restarting the node, rlpx will reconnect regardless.

fjl commented 5 days ago

In that case, let's solve it by adding a flag like HIVE_EL_DISCV5 that we set in this suite. Then disable v4 and enable v5 in scripts when the variable is set.

fjl commented 5 days ago

I still think you should eventually add support for multiplexing different protocols on the same port alongside discv5. That's something that could be implemented in sigp/discv5 crate. You'll need that if you ever want to run QUIC on the same port as discovery, or if we ever add more UDP protocols into the mix.

The idea with multiplexing everything on one port is that the NAT hole punching can be shared, among other things. Creating a working NAT hole for one port is hard enough, why should every protocol perform its own mapping.

emhane commented 5 days ago

I still think you should eventually add support for multiplexing different protocols on the same port alongside discv5. That's something that could be implemented in sigp/discv5 crate. You'll need that if you ever want to run QUIC on the same port as discovery, or if we ever add more UDP protocols into the mix.

The idea with multiplexing everything on one port is that the NAT hole punching can be shared, among other things. Creating a working NAT hole for one port is hard enough, why should every protocol perform its own mapping.

yeah, it's a good idea eventually, but until RLPx is NAT hole punching, is quite a bit to go. not sure if all discv5 impls have implemented hole punching yet to start with. lighthouse is using another port for QUIC, so should be fine for now https://github.com/sigp/lighthouse/blob/f1d88ba4b1256e084d3a2a547e0ce574b896246c/beacon_node/lighthouse_network/src/config.rs#L20-L21.

fjl commented 5 days ago

lighthouse is using another port for QUIC, so should be fine for now

Sorry to turn this into discussion. But now I am curious how the QUIC port gets relayed in discovery. Since the ENR can only have one UPD port, how do you tell your peers about the QUIC port number?

emhane commented 5 days ago

Sorry to turn this into discussion. But now I am curious how the QUIC port gets relayed in discovery. Since the ENR can only have one UPD port, how do you tell your peers about the QUIC port number?

oh no don't worry, appreciate the curiosity :) I've seen enrs with a "quic" and "quic6" kv pair