Closed mfogelman closed 9 years ago
There is no reason for the exact number.
The reason for choosing a low number is the need for (secondary) throttling. Having many tasks active at once just causes them to compete for RPC slots (the primary throttling mechanism), which makes them take longer and could lead to token timeouts if congestion became extreme.
You should also take into account that this limit is multiplied by the number of active RPCServer instances, i.e. on a server-class machine with multi-homing it gets scaled up.
On the other hand, cheap NAT devices (the kind you generally find in a home environment) would just get overloaded by too much UDP traffic anyway.
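To make the secondary-throttling idea concrete, here is a minimal, self-contained sketch (our own illustration, not mldht's actual TaskManager code): a fixed permit count analogous to MAX_ACTIVE_TASKS caps how many tasks run at once, and everything else queues.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a secondary throttle: at most `maxActive`
// tasks hold a permit at any time; the rest wait their turn.
public class TaskThrottle {

    // Runs `totalTasks` simulated lookups and returns the peak number
    // of tasks that were ever active simultaneously.
    public static int run(int maxActive, int totalTasks) throws Exception {
        Semaphore permits = new Semaphore(maxActive);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(totalTasks);
        ExecutorService pool = Executors.newCachedThreadPool();
        for (int i = 0; i < totalTasks; i++) {
            pool.submit(() -> {
                try {
                    permits.acquire();                       // secondary throttle
                    int now = running.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);   // record concurrency
                    Thread.sleep(20);                        // simulated lookup work
                    running.decrementAndGet();
                    permits.release();
                } catch (InterruptedException ignored) {
                } finally {
                    done.countDown();
                }
            });
        }
        done.await();
        pool.shutdown();
        return peak.get();
    }

    public static void main(String[] args) throws Exception {
        // With a cap of 7, peak concurrency never exceeds 7 even for 50 tasks.
        System.out.println("peak concurrency: " + run(7, 50));
    }
}
```

Raising the cap (say, per extra RPCServer instance) linearly raises the peak concurrency, which is why the effective limit scales up on multi-homed machines.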
Have you observed any problems?
Hi, thanks a lot for the response!!
I haven't seen any issues with it; it's just that, since finding peers for several hashes is the primary use of the library for me, it looks like it could take on more tasks at the same time... I tried multiplying those numbers and got better and better results running on a server. I'll keep testing it.
Thanks a lot again! Regards, Martin
If you're only interested in getting peers, as opposed to announcing, you can set PeerLookupTask#setFastTerminate(true). This allows those tasks to terminate based on stall timeouts (derived from connection latency) instead of hard timeouts (10 s).
Additionally, setLowPriority(true) might actually improve performance if you issue many tasks at once: since it decreases parallelism per task, it allows increased parallelism between multiple tasks.
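The difference between the two termination modes can be sketched with a toy model (the method names, stall window, and numbers below are our own illustration, not mldht's internals): a hard timeout always waits the full 10 s, while fast-terminate ends the lookup shortly after responses stop arriving.

```java
// Toy comparison of a fixed hard timeout vs. a stall-based timeout.
public class TerminationPolicy {
    static final long HARD_TIMEOUT_MS = 10_000;

    // lastResponses: elapsed-ms timestamps of received responses, ascending.
    // stallMs: latency-derived stall window (an assumption for illustration).
    public static long terminationTime(long[] lastResponses, long stallMs,
                                       boolean fastTerminate) {
        if (!fastTerminate || lastResponses.length == 0) {
            return HARD_TIMEOUT_MS;          // wait out the full hard timeout
        }
        long last = lastResponses[lastResponses.length - 1];
        // Fast-terminate: stop once no response has arrived for `stallMs`.
        return Math.min(HARD_TIMEOUT_MS, last + stallMs);
    }

    public static void main(String[] args) {
        long[] responses = {100, 250, 400};  // lookup stalls after 400 ms
        System.out.println(terminationTime(responses, 500, false)); // 10000
        System.out.println(terminationTime(responses, 500, true));  // 900
    }
}
```

With many concurrent lookups, finishing each one shortly after it stalls frees its task slot much earlier, which is where the throughput gain comes from.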
Thank you so much! What you suggested is really great! It's flying now.
One last question: do you have any idea what percentage of seeders is found during a PeerLookupTask run, and whether there's a way to maximize it, regardless of how long the task takes?
Seeds specifically or any peer regardless of completion status?
Any peer, regardless of completion status...
Hrrm... for large swarms with many peers it might be possible to get a few additional ones simply by running the lookup again, since each response may contain a randomly sampled subset.
But I think the best way to get a good view of the swarm is to connect to the peers with the bittorrent protocol and use PEX.
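The re-run suggestion can be sketched with a toy sampling model (purely illustrative; real responses depend on the remote nodes, and the swarm size and sample size here are made-up numbers): if each response is a random subset of the swarm, repeated lookups cover more of it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Toy model: each lookup round returns `perResponse` peers drawn
// uniformly from a swarm of `swarmSize`; we count distinct peers seen.
public class RepeatedSampling {

    public static int covered(int swarmSize, int perResponse, int rounds, long seed) {
        Random rnd = new Random(seed);
        List<Integer> swarm = new ArrayList<>();
        for (int i = 0; i < swarmSize; i++) swarm.add(i);
        Set<Integer> seen = new HashSet<>();
        for (int r = 0; r < rounds; r++) {
            Collections.shuffle(swarm, rnd);               // random subset per round
            seen.addAll(swarm.subList(0, perResponse));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // One round sees exactly 50 distinct peers; five rounds see more,
        // with diminishing returns as overlaps accumulate.
        System.out.println(covered(1000, 50, 1, 42));
        System.out.println(covered(1000, 50, 5, 42));
    }
}
```

The diminishing returns are why PEX, which exchanges peer lists directly with connected peers, gives a fuller view of the swarm than repeated DHT lookups.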
Hi, how are you doing?
May I ask why the TaskManager runs up to 7 (MAX_ACTIVE_TASKS) tasks in parallel and queues the rest? What constraint led you to set it to that value rather than 8, 20, or 100?
Thanks a lot in advance! Regards, Martin