Open acolytec3 opened 2 years ago
As a starting point, I prefer number 4 of these as it is far and away the least complex and seems like we could tackle this at some future time once we begin to observe routing table/network health and how hard it is for nodes to find a sufficient number of neighbors.
I'm curious what the implementation teams think about option 3 (using uTP) for this response when the response size exceeds the packet size. It seems like it should/could be cleaner than the messy logic of multiple disparate packets.
I'm also game to entertain option 4, but I'm a little concerned about imposing this limitation, though I also don't have a compelling reason to justify needing larger response sizes.
I like the option 3
as:
tldr: I'd be fine with using (optionally) uTP for a larger amount of ENRs.
I think the problems with the current version in spec (or solution 1. from above) are:
- [...] `TALKRESP` are allowed on a `TALKREQ`. This is for example explicitly stated for the `NODES` message.
- [...] `TALKRESP` message currently.
- [...] `total` field and sending multiple responses (mimicking the discv5 `NODES` behaviour). The Discovery v5 layer is however not aware of this and has no idea of how many `NODES` messages are supposed to arrive and thus how many it should accept (one could allow "any" amount of packets with a certain timeout, but that could then probably be abused).
- The `enrs` field in the message is defined as `List[ByteList, 32]`, while 32 can never be reached, and this same limit cannot be applied on the end amount of ENRs as it is a limit imposed on the serialized field of the message, not the total over several messages.

An adapted version of solution 1. would be to do the framing on discv5 level: split the packet at the discv5 layer while keeping it as 1 big Portal message. This could then pack even 32 ENRs. However, I don't like this solution because:
- [...] `total` was not reached)

I agree also with the downsides mentioned for solution 2.
Solution 4. is what is done now in Fluffy, and it typically allows us to pack ~8 ENRs in the message, which is not great but it is sufficient (for now).
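To make the "~8 ENRs" figure concrete, here is a minimal sketch of the kind of greedy packing that solution 4 implies. It is not Fluffy's actual code: `pack_enrs`, `ASSUMED_OVERHEAD` and the 140-byte record size in the usage note are hypothetical, and the only numbers taken from the thread are the 1280-byte packet limit and the resulting count of roughly 8.

```python
from typing import List, Tuple

MAX_PACKET_SIZE = 1280  # discv5 packet size limit quoted in the issue
ASSUMED_OVERHEAD = 100  # hypothetical: discv5 header + TALKRESP + Portal message framing
PER_ENR_OVERHEAD = 4    # SSZ stores a 4-byte offset per variable-size list element


def pack_enrs(
    enrs: List[bytes],
    budget: int = MAX_PACKET_SIZE - ASSUMED_OVERHEAD,
) -> Tuple[List[bytes], List[bytes]]:
    """Take ENRs in order until the next one would exceed the byte budget.

    Returns (packed, leftover). With solution 4 the leftover ENRs are
    simply not sent to the requester.
    """
    packed: List[bytes] = []
    used = 0
    for index, enr in enumerate(enrs):
        cost = len(enr) + PER_ENR_OVERHEAD
        if used + cost > budget:
            return packed, enrs[index:]
        packed.append(enr)
        used += cost
    return packed, []
```

Under these assumptions, 32 candidate records of 140 bytes each would yield 8 packed ENRs and 24 dropped ones, which is in line with the practical limit described above. The real overhead depends on the discv5 header and the encoding used, so the constants should be measured rather than trusted.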
Conclusion: I'm also in favor of sending ENRs over uTP when the amount of ENRs cannot be packed in a single discv5 talkresp message. Perhaps it should be left as an optional behaviour for a client to do.
Note however, that the same applies for sending ENRs back on a FindContent request. If we apply the same solution there (using uTP), there will probably be the need to discern the uTP data (content vs ENRs).
I haven't researched this in great detail yet but @ScottyPoi opened this research issue for Ultralight and I think we're starting to see some of the knock-on effects of effectively limiting FINDNODES/NODES to the current practical maximum of 8-9 ENRs. The short of it is that joining nodes are effectively unable to find a subset of other nodes despite actively looking: even though a given node may have a peer that is a bootnode and knows a broad swath of nodes in the network, the bootnode is limited to sharing 8-9 ENRs with the requesting node. As such, once the joining node makes its initial request to the bootnode to populate its routing table, it won't know to re-request nodes from the bootnode and will likely never get all of them anyway since the bootnode only sends the first 8 it can pull from its table at whatever distance (or set of requested distances).
Implementations should be able to quickly populate their routing tables even with only 8-9 ENRs per response.
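One way to read that claim is that a joining node can compensate for the per-response cap by asking each peer for one distance at a time and then querying the peers it discovers. The sketch below simulates this with small integer node IDs. Everything in it is hypothetical (`find_nodes`, `populate`, the in-memory `tables` dict, and the use of the XOR bit length as log distance); it is not the lookup logic of any client.

```python
from collections import deque
from typing import Dict, List, Set

MAX_ENRS_PER_RESPONSE = 8  # practical cap discussed in the thread


def log_distance(a: int, b: int) -> int:
    """Assumed metric: bit length of the XOR of two node IDs (0 means same ID)."""
    return (a ^ b).bit_length()


def find_nodes(tables: Dict[int, Set[int]], peer: int, distances: List[int]) -> List[int]:
    """Simulated FINDNODES: `peer` returns at most 8 known IDs at the given distances."""
    known = tables.get(peer, set())
    matches = sorted(n for n in known if log_distance(peer, n) in distances)
    return matches[:MAX_ENRS_PER_RESPONSE]


def populate(tables: Dict[int, Set[int]], self_id: int, bootnode: int, id_bits: int = 8) -> Set[int]:
    """Breadth-first sweep: one request per distance per peer, following new peers."""
    known: Set[int] = {bootnode}
    queue = deque([bootnode])
    while queue:
        peer = queue.popleft()
        for distance in range(1, id_bits + 1):
            for found in find_nodes(tables, peer, [distance]):
                if found != self_id and found not in known:
                    known.add(found)
                    queue.append(found)
    return known
```

In this toy model a single request covering all distances returns only 8 IDs, while the per-distance sweep recovers everything the bootnode holds in its sparsely populated distances and reaches additional nodes through the peers it discovers. Distances where a peer knows more than 8 nodes are still truncated, which is the gap the thread is concerned about.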
The NODES message in the Portal Network wire spec calls for a responding node to send multiple NODES messages when it needs to send more ENRs to the requesting node than fit in the byte size allowed by discv5 (1280 bytes); this follows the Discv5 wire spec on NODES messages. The challenge here is that Portal Network messages are included in a payload for the discv5 TALKREQ/TALKRESP message type, and the Discv5 wire spec only allows one TALKRESP per TALKREQ sent. As such, at least in the JavaScript implementation of discv5, additional Portal Network NODES messages are dropped instead of decoded by discv5.
Several alternatives are available: 1) Add new logic at the discv5 layer to look for Portal Network NODES messages and handle them similarly to Discv5 NODES messages (i.e. determine how many will be sent and keep track of each response as it comes in).
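As a rough illustration of the bookkeeping that alternative 1 would need, the sketch below tracks the responses for a single outstanding request. It is only a sketch under stated assumptions: `NodesCollector`, `max_messages` and the string return values are hypothetical, and applying the limit of 32 ENRs across all messages is an assumption rather than something the spec states.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_TOTAL_ENRS = 32  # assumption: apply the List[ByteList, 32] limit across all messages


@dataclass
class NodesCollector:
    """Tracks the NODES responses that belong to one outstanding request."""

    deadline: float                 # absolute time after which responses are ignored
    max_messages: int = 5           # hypothetical cap on `total` to limit abuse
    expected: Optional[int] = None  # `total` announced by the first response
    received: int = 0
    enrs: List[bytes] = field(default_factory=list)

    def on_response(self, total: int, enrs: List[bytes], now: float) -> str:
        """Return 'pending', 'done' or 'rejected' for one incoming NODES message."""
        if now > self.deadline:
            return "rejected"
        if self.expected is None:
            # The first response fixes how many messages will be accepted.
            if not 1 <= total <= self.max_messages:
                return "rejected"
            self.expected = total
        elif total != self.expected or self.received >= self.expected:
            return "rejected"
        if len(self.enrs) + len(enrs) > MAX_TOTAL_ENRS:
            return "rejected"
        self.received += 1
        self.enrs.extend(enrs)
        return "done" if self.received == self.expected else "pending"
```

A caller would create one collector per request and feed it each decoded message, for example `NodesCollector(deadline=now + 2.0).on_response(total, enrs, now)`. The timeout and the cap on `total` are there so that a peer cannot keep a request open indefinitely or flood the requester with packets.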