uber / kraken

P2P Docker registry capable of distributing TBs of data in seconds
Apache License 2.0

Does Kraken support P2P Peer Optimal Algorithm based on network topology awareness? #244

Closed duyanghao closed 4 years ago

duyanghao commented 4 years ago

I am curious whether Kraken supports a P2P peer-optimal algorithm. Suppose Peer1 can get data from either Peer2 or Peer3, and Peer2 is closer to Peer1 in terms of network topology. Will Peer1 end up choosing Peer2 instead of Peer3?
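To make the question concrete, here is a minimal sketch of what topology-aware peer selection could look like. The `(zone, rack)` labels, `topology_distance`, and `rank_peers` are hypothetical names used only for illustration; none of them come from Kraken.

```python
def topology_distance(a, b):
    """Return a coarse distance between two (zone, rack) locations.

    Hypothetical metric: 0 = same rack, 1 = same zone, 2 = different zone.
    """
    if a == b:
        return 0
    if a[0] == b[0]:
        return 1
    return 2


def rank_peers(requester, candidates, locations):
    """Order candidate peers from closest to farthest from the requester."""
    origin = locations[requester]
    return sorted(candidates, key=lambda c: topology_distance(origin, locations[c]))


# Assumed example layout: Peer2 shares a zone with Peer1, Peer3 does not.
locations = {
    "Peer1": ("zone-1", "rack-1"),
    "Peer2": ("zone-1", "rack-2"),
    "Peer3": ("zone-2", "rack-1"),
}
ranked = rank_peers("Peer1", ["Peer3", "Peer2"], locations)
print(ranked)
```

With a policy like this, Peer1 would prefer Peer2 over Peer3.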

yiranwang52 commented 4 years ago

Kraken is not aware of the underlying network topology as of today.

We tried a simple implementation in the early days, but it had a severe negative impact on performance. My theory is this: assume group A and group B are far apart, and both want the same image. Group A started downloading a couple of seconds earlier and occupied all open connections on the origin cluster. Group B peers could not reach the origin and were not recommended to talk to group A, so they connected only among themselves and could not get any data. Group B peers were only able to make progress after most of group A had finished downloading.

We then opted for separate clusters in each physical zone, and let them share storage or set up replication.

If you are curious, it's easy to extend the tracker implementation by adding new policies, or to write some simple simulations.
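As a rough illustration of the kind of simulation meant here, the toy model below replays the group A / group B scenario described above. It is only a sketch under stated assumptions: rounds, pieces, the slot model, and every name in it (`Peer`, `simulate`, `strict_locality`) are invented for this example and do not reflect Kraken's tracker code.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    name: str
    group: str              # hypothetical zone label, "A" or "B"
    start: int              # round in which the peer begins downloading
    pieces: set = field(default_factory=set)
    has_slot: bool = False  # True while holding an origin connection


def simulate(strict_locality, num_pieces=8, peers_per_group=4,
             origin_slots=4, delay=2, max_rounds=200):
    """Return (first_piece_round, finish_round) dicts keyed by group.

    Assumed model: one piece per peer per round; an origin slot is held
    until the peer completes; with strict_locality=True a peer may only
    fetch from peers in its own group.
    """
    peers = [Peer(f"A{i}", "A", 0) for i in range(peers_per_group)]
    peers += [Peer(f"B{i}", "B", delay) for i in range(peers_per_group)]
    free_slots = origin_slots
    first_piece, finished = {}, {}

    for rnd in range(max_rounds):
        # Pieces gained this round become shareable only in the next round.
        snapshot = {p.name: set(p.pieces) for p in peers}
        released = 0
        for p in peers:
            if rnd < p.start or len(p.pieces) == num_pieces:
                continue
            if not p.has_slot and free_slots > 0:
                p.has_slot = True
                free_slots -= 1
            got = None
            if p.has_slot:
                # The origin has every piece; take the lowest missing one.
                got = min(set(range(num_pieces)) - p.pieces)
            else:
                for q in peers:
                    if q is p or (strict_locality and q.group != p.group):
                        continue
                    available = snapshot[q.name] - p.pieces
                    if available:
                        got = min(available)
                        break
            if got is None:
                continue  # stalled: no origin slot and no useful peer
            p.pieces.add(got)
            first_piece.setdefault(p.group, rnd)
            if len(p.pieces) == num_pieces:
                finished[p.group] = rnd  # last finisher in the group wins
                if p.has_slot:
                    p.has_slot = False
                    released += 1
        free_slots += released  # slots free up at the end of the round
        if all(len(p.pieces) == num_pieces for p in peers):
            break
    return first_piece, finished


for strict in (True, False):
    first, done = simulate(strict_locality=strict)
    print(f"strict_locality={strict}: first piece {first}, finished {done}")
```

With the default parameters, the strict-locality run leaves group B without a single piece until group A has finished and released its origin slots, while the unrestricted run lets group B start pulling from group A as soon as it joins. Changing `origin_slots`, `delay`, or the selection rule is a cheap way to try out other policies before touching the tracker.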

duyanghao commented 4 years ago

@yiranwang52 Thanks for your reply. Does this mean that Kraken's support for this feature is not good enough?

yiranwang52 commented 4 years ago

Currently, Kraken doesn't support this at all. Since our simple approach didn't work, we removed the code.

duyanghao commented 4 years ago

@yiranwang52 What's the plan for this feature? Or does Kraken not intend to support it at all? What do you think of this feature?

yiranwang52 commented 4 years ago

Uber probably wouldn't need this, given how we design zones and regions. If any contributor can make it work as an option, with convincing data, we will be glad to merge it in.

duyanghao commented 4 years ago

@yiranwang52 Thanks for your reply. I'll close this issue, since I have no further ideas on this for now.