Misplaced pullsync rate limiter?

ldeffenb commented 3 months ago

Context

2.0.0

Summary

While investigating a reduction in pullsync rates filling the reserve in a brand new node, I looked at the source and found the pullsync rate limiter. But I don't think it's actually doing what the comment and/or author thinks it is doing. https://github.com/ethersphere/bee/blob/501f8a4ef9caddd2d080829270a77cdd67cf1573/pkg/pullsync/pullsync.go#L259-L271

The bin in this case is the bin in the REMOTE peer, not the local peer. So the chunks being throttled across peers with a common bin number are not actually the "same" chunks, but chunks with a probable different prefix.

Expected behavior

I would expect a throttle on the actual chunks, not the random prefixes of remote peer bins.

Actual behavior

Consider syncing bin 12 from two peers with prefixes 0x108... and 0x10F... None of the chunks pulled from these same-numbered bins will actually match because their prefixes are in different ranges. Granted, this might work for lower numbered bins, but it imposes artificial locking in higher numbered bins where the actual chunk addresses will never match.

Steps to reproduce

This is a thought exercise.

Possible solution

Don't really know how to accomplish a rate limit on an actual shared range of chunk addresses when they're coming from different peer prefixes and bins...

ldeffenb commented 3 months ago

This limiter/lock was added 9 months ago: https://github.com/ethersphere/bee/commit/6fb07f335dd02e1118cab5cac40282b5aa5551b4

ldeffenb commented 3 months ago

Actually, I just had a thought. If the lock were on the AND of the bin expressed as binary ones vs the peer ID, then we'd actually be locking on the ACTUAL prefix of the chunks which would accomplish the comment-stated desire of the rate limiting lock.

For instance from my example above, bin 12 from two peers with prefixes 0x108... and 0x10F... would be individual locks on 0x108 and 0x10F.

But that wouldn't really work because we'd really need nested locks because a lock on 0x108 would also need to prevent bin 8 from a peer starting with 0x10... which is a bit more difficult, isn't it?

And taking that to it's logical conclusion, pulling bin 1 from any peer would be locking 1/2 of the swarm out from doing puts, but then, that's the stated intention, isn't it?

ldeffenb commented 1 month ago

It appears that the bin-locking code was removed in commit 52c2475 so this is no longer an issue. https://github.com/ethersphere/bee/blob/52c24758463e8c94d7421a3c46089a543b5e93f7/pkg/pullsync/pullsync.go#L255-L260

ethersphere / bee