Closed ldeffenb closed 1 month ago
This limiter/lock was added 9 months ago: https://github.com/ethersphere/bee/commit/6fb07f335dd02e1118cab5cac40282b5aa5551b4
Actually, I just had a thought. If the lock key were the peer ID masked down to the bin depth (i.e. the first `bin` bits of the peer's address), then we'd actually be locking on the ACTUAL prefix of the chunks, which would accomplish the comment-stated desire of the rate-limiting lock.
For instance from my example above, bin 12 from two peers with prefixes 0x108... and 0x10F... would be individual locks on 0x108 and 0x10F.
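To make that concrete, here is a minimal sketch of deriving a lock key from the first `bin` bits of a peer's overlay address. This is a hypothetical helper, not bee's actual code: peers 0x108... and 0x10F... get distinct keys at bin 12 but would share a key at bin 8.

```go
package main

import "fmt"

// binPrefix derives a lock key from the first `bin` bits of a peer's
// overlay address, zeroing every bit past the bin depth.
// Hypothetical helper for illustration only.
func binPrefix(addr []byte, bin int) string {
	out := make([]byte, len(addr))
	for i := range addr {
		bitsLeft := bin - i*8
		switch {
		case bitsLeft >= 8:
			out[i] = addr[i] // whole byte inside the prefix
		case bitsLeft > 0:
			out[i] = addr[i] & byte(0xFF<<(8-bitsLeft)) // partial byte
		default:
			out[i] = 0 // past the prefix
		}
	}
	return fmt.Sprintf("%x/%d", out, bin)
}

func main() {
	peerA := []byte{0x10, 0x80} // 0x108...
	peerB := []byte{0x10, 0xF0} // 0x10F...
	fmt.Println(binPrefix(peerA, 12)) // distinct keys at bin 12
	fmt.Println(binPrefix(peerB, 12))
	fmt.Println(binPrefix(peerA, 8)) // shared key at bin 8
	fmt.Println(binPrefix(peerB, 8))
}
```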
But that wouldn't really work on its own: we'd need nested locks, because a lock on 0x108 would also need to block syncing bin 8 from a peer starting with 0x10..., which is a bit more difficult, isn't it?
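The nesting problem reduces to a prefix-containment check: two prefix locks conflict exactly when they agree on the first min(binA, binB) bits, i.e. when the shorter prefix is a bit-prefix of the longer one. A sketch (hypothetical helper, not bee code):

```go
package main

import "fmt"

// overlaps reports whether two prefix locks (addr, bin) collide: they do
// exactly when they agree on the first min(binA, binB) bits.
// Hypothetical helper for illustration only.
func overlaps(a []byte, binA int, b []byte, binB int) bool {
	n := binA
	if binB < n {
		n = binB
	}
	for i := 0; i*8 < n; i++ {
		bits := n - i*8
		if bits > 8 {
			bits = 8
		}
		mask := byte(0xFF << (8 - bits))
		if a[i]&mask != b[i]&mask {
			return false
		}
	}
	return true
}

func main() {
	lockA := []byte{0x10, 0x80} // 0x108..., bin 12
	lockB := []byte{0x10, 0x00} // 0x10...,  bin 8
	lockC := []byte{0x10, 0xF0} // 0x10F..., bin 12
	fmt.Println(overlaps(lockA, 12, lockB, 8))  // bin-8 lock covers 0x108...
	fmt.Println(overlaps(lockA, 12, lockC, 12)) // sibling bin-12 locks don't collide
}
```

So a flat map of prefix locks isn't enough; every acquisition would have to scan for shorter covering prefixes (or use a trie), which is the "more difficult" part.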
And taking that to its logical conclusion, pulling bin 1 from any peer would be locking 1/2 of the swarm out from doing puts, but then, that's the stated intention, isn't it?
It appears that the bin-locking code was removed in commit 52c2475 so this is no longer an issue. https://github.com/ethersphere/bee/blob/52c24758463e8c94d7421a3c46089a543b5e93f7/pkg/pullsync/pullsync.go#L255-L260
Context
2.0.0
Summary
While investigating a reduction in pullsync rates filling the reserve in a brand new node, I looked at the source and found the pullsync rate limiter. But I don't think it's actually doing what the comment and/or author thinks it is doing. https://github.com/ethersphere/bee/blob/501f8a4ef9caddd2d080829270a77cdd67cf1573/pkg/pullsync/pullsync.go#L259-L271
The bin in this case is the bin in the REMOTE peer, not the local peer. So the chunks being throttled across peers with a common bin number are not actually the "same" chunks; they most likely have different prefixes.
Expected behavior
I would expect a throttle on the actual chunks, not the random prefixes of remote peer bins.
Actual behavior
Consider syncing bin 12 from two peers with prefixes 0x108... and 0x10F... None of the chunks pulled from these same-numbered bins will actually match, because their prefixes are in different ranges. Granted, this might work for lower-numbered bins, but it imposes artificial locking in higher-numbered bins where the actual chunk addresses will never match.
Steps to reproduce
This is a thought exercise.
Possible solution
Don't really know how to accomplish a rate limit on an actual shared range of chunk addresses when they're coming from different peer prefixes and bins...