greatest-ape / aquatic

High-performance open BitTorrent tracker (UDP, HTTP, WebTorrent)
Apache License 2.0

Consider using HMAC for stateless connection id handling #72

Closed by shyba 2 years ago

shyba commented 2 years ago

Hi, thanks for this great project!

While benchmarking the UDP implementation against chihaya, I noticed some interesting results and decided to dig deeper. Mainly, the connect response handling looks very different. (Please ignore the raw numbers, as this is a Qubes VM on a notebook. The interesting part is the order of magnitude and the behavior over time.)

The difference in connect handling seems to be related to stateless handling of connection IDs through HMAC, which chihaya does here: https://github.com/chihaya/chihaya/blob/828edb8fd8bea3e86ba53fd39aa9a8b89c95a781/frontend/udp/connection_id.go

Initially, aquatic starts with very high throughput for announces but low throughput for connects.

Requests out: 154357.21/second
Responses in: 154132.02/second
  - Connect responses:  221.60
  - Announce responses: 128203.99
  - Scrape responses:   25706.44
  - Error responses:    0.00
Peers per announce response: 5.01

Then, over time, the number of peers per response finally gets closer to the default limit of 50, while throughput drops as expected:

Requests out: 24783.82/second
Responses in: 24723.03/second
  - Connect responses:  60.79
  - Announce responses: 20552.83
  - Scrape responses:   4109.41
  - Error responses:    0.00
Peers per announce response: 45.61

chihaya starts with a much higher connect response rate, which brings in far more peers.

Requests out: 42182.74/second
Responses in: 33390.18/second
  - Connect responses:  6619.61
  - Announce responses: 22347.70
  - Scrape responses:   4422.87
  - Error responses:    0.00
Peers per announce response: 61.34

Requests out: 34582.55/second
Responses in: 27301.13/second
  - Connect responses:  5521.26
  - Announce responses: 18218.75
  - Scrape responses:   3561.11
  - Error responses:    0.00
Peers per announce response: 84.60

I believe making it stateless should give aquatic a nice boost in connect response handling. Hopefully it would also reduce memory usage for millions of peers, given that we could remove ConnectionMap.

greatest-ape commented 2 years ago

Thanks, this is interesting.

I think using a timestamp + truncated HMAC for connection IDs is worth exploring just to reduce memory usage and avoid GC pauses, as long as using only 32 bits of HMAC is acceptable. Thanks for bringing it to my attention.

However, I don’t really see how it could account for the difference between aquatic and chihaya in the pattern of connect/announce responses. The current hashmap lookup method is ridiculously fast. I will have to do some probing myself into what could cause those differences.
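
As a rough illustration of the timestamp + truncated HMAC idea, here is a minimal Python sketch. It is not the code of either tracker: the key handling, the 120-second validity window, the choice to bind the ID to the source IP, and all function names are assumptions made for the example, and HMAC-SHA256 from the standard library stands in for whatever MAC a real tracker would use. The 8-byte ID is a 4-byte timestamp followed by the first 4 bytes (32 bits) of the MAC, so the tracker needs no per-client state to validate it.

```python
import hashlib
import hmac
import ipaddress
import struct
import time
from typing import Optional

# Hypothetical values: a real tracker would generate the key randomly at startup.
SECRET_KEY = b"replace-with-random-bytes-generated-at-startup"
MAX_AGE_SECONDS = 120  # assumed validity window for a connection ID


def _mac32(timestamp: int, ip: str) -> bytes:
    """Return the first 4 bytes of HMAC-SHA256 over (timestamp, source IP)."""
    message = struct.pack("!I", timestamp) + ipaddress.ip_address(ip).packed
    return hmac.new(SECRET_KEY, message, hashlib.sha256).digest()[:4]


def create_connection_id(ip: str, now: Optional[int] = None) -> bytes:
    """Build an 8-byte connection ID: 4-byte timestamp + 4-byte truncated MAC."""
    timestamp = int(time.time()) if now is None else now
    return struct.pack("!I", timestamp) + _mac32(timestamp, ip)


def is_valid(connection_id: bytes, ip: str, now: Optional[int] = None) -> bool:
    """Check age and MAC of a connection ID without any stored state."""
    if len(connection_id) != 8:
        return False
    now = int(time.time()) if now is None else now
    (timestamp,) = struct.unpack("!I", connection_id[:4])
    if not 0 <= now - timestamp <= MAX_AGE_SECONDS:
        return False  # expired, or timestamp lies in the future
    # Constant-time comparison avoids leaking how many MAC bytes matched.
    return hmac.compare_digest(connection_id[4:], _mac32(timestamp, ip))
```

A connect request would be answered with `create_connection_id(source_ip)`, and each later announce or scrape would be checked with `is_valid(connection_id, source_ip)`. Because validation only recomputes the MAC, nothing has to be stored or cleaned up per connection.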

greatest-ape commented 2 years ago

I’ve replaced ConnectionMap with a BLAKE3 MAC-based ConnectionValidator :-)

When it comes to the aquatic/chihaya differences, I haven’t looked at them, but it’s worth mentioning that connect requests by themselves do not add peers to the swarm.

lorislibralato commented 9 months ago

With the current implementation, a malicious user can impersonate another user via UDP spoofing and brute-forcing the start time

greatest-ape commented 9 months ago

> With the current implementation, a malicious user can impersonate another user via UDP spoofing and brute-forcing the start time

@lorislibralato Could you please elaborate on the exact steps an exploit would consist of?

In my understanding, an attacker would need to make on average 2147483647 attempts of sending fake announce requests to successfully guess the last 4 bytes of a single hash. Is this the attack you’re thinking of?
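
The estimate above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes the attacker tries each 32-bit MAC value at most once against a fixed target, and the guess rate passed in the usage note is an arbitrary assumption, not a measured figure from this thread. The function names are hypothetical.

```python
def expected_guesses(mac_bits: int) -> float:
    """Average number of distinct guesses needed to hit one unknown MAC value."""
    # Guessing without repetition among N equally likely values takes (N + 1) / 2 tries on average.
    return (2**mac_bits + 1) / 2


def expected_hours(mac_bits: int, guesses_per_second: float) -> float:
    """Average time to succeed at a given (assumed) rate of spoofed requests."""
    return expected_guesses(mac_bits) / guesses_per_second / 3600
```

For a 32-bit MAC, `expected_guesses(32)` is about 2^31, in line with the figure quoted above. At an assumed 100,000 spoofed requests per second, `expected_hours(32, 100_000)` comes out at a little under six hours, so a short validity window for each connection ID matters as much as the MAC length.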

I have some ideas in mind to harden this further.

lorislibralato commented 9 months ago

My bad, I didn't see that you use HMAC and not a simple hashing algorithm