probe-lab / network-measurements

DHT Lookup Latency Increase since mid-June 2023 #56

Closed yiannisbot closed 1 year ago

yiannisbot commented 1 year ago

Context

We've been observing a slight increase in the DHT Lookup Latency since around mid-June 2023. The increase is on the order of ~10% and is captured in our measurement plots at: https://probelab.io/ipfskpi/#dht-lookup-performance-long-plot. This is a tracking issue to identify the cause of the latency increase.

Evidence

Below is the short-term latency graph (https://probelab.io/ipfsdht/#dht-lookup-performance-overall-plot):

Screenshot 2023-07-28 at 8 43 07 AM

Looking at the CDFs of the DHT Lookup Latency across different regions over time, we see a clear shift towards the right of the plot for several regions, most notably for eu-central, but also for ap-south-1 and af-south-1 (in Week 27).

Week 24 (2023-06-12/18) https://github.com/plprobelab/network-measurements/tree/master/reports/2023/calendar-week-24/ipfs#dht-performance

DHT-lookup-week-24

Week 25 (2023-06-19/25) https://github.com/plprobelab/network-measurements/tree/master/reports/2023/calendar-week-25/ipfs#dht-performance

DHT-lookup-week-25

Week 26 (2023-06-26 - 2023-07-02) https://github.com/plprobelab/network-measurements/tree/master/reports/2023/calendar-week-26/ipfs#dht-performance

DHT-lookup-week-26

Week 27 (2023-07-03/09) https://github.com/plprobelab/network-measurements/tree/master/reports/2023/calendar-week-27/ipfs#dht-performance

DHT-lookup-week-27
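To put a number on the rightward shift of the CDFs, one option is to compare the same percentile across two weeks. The snippet below is a minimal sketch, not how the ProbeLab reports are produced: the helper names `percentile` and `relative_shift` are hypothetical, and the latency samples are made-up values chosen to show a ~10% shift.

```python
import math


def percentile(samples, q):
    """Nearest-rank percentile of a list of latencies (q in (0, 100])."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def relative_shift(before, after, q=50):
    """Fractional change of the q-th percentile from `before` to `after`."""
    base = percentile(before, q)
    return (percentile(after, q) - base) / base


# Hypothetical lookup latencies in seconds for one region, NOT real measurements.
week_24 = [0.40, 0.50, 0.60, 0.80, 1.00]
week_27 = [0.44, 0.55, 0.66, 0.88, 1.10]

for q in (50, 90):
    print(f"p{q} shift: {relative_shift(week_24, week_27, q):+.1%}")
```

Running this per region and per week would show whether the shift is concentrated in a few regions (as the plots suggest for eu-central) or spread evenly.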

Thoughts

The latency seems to be heading back down, but we're not sure if there's a specific reason for this behaviour. Some thoughts:

Screenshot 2023-07-28 at 9 04 54 AM

Any other thoughts @Jorropo @aschmahmann @lidel @hacdias?

yiannisbot commented 1 year ago

The DHT Lookup Latency seems to have gone down. Although I'm not terribly proud that we didn't have time to look into this in more detail, I'll close this issue, as it doesn't seem like an alarming case.

One observation that might explain the situation, though, is that DHT Lookup Latency seems to go up when lots of peers show as "Offline" at: https://probelab.io/ipfsdht/#dht-availability-classified-overall-plot (related issue: https://github.com/plprobelab/network-measurements/issues/57).

Putting the plots side by side, we see that when we have lots of peers that appear offline (seen for less than 10% of the time), latency goes up (red rectangles), while the opposite happens when the number of offline peers goes down (green rectangle).

dht-availability-classified-overall

dht-lookup-performance-overall
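The side-by-side comparison above is visual. A rough way to check it numerically would be to correlate the daily share of "Offline" peers with the daily median lookup latency. The snippet below is only a sketch under assumptions: the `pearson` helper is written here for illustration, and both series are hypothetical values, not data exported from the plots. A high coefficient would support the observation but would not establish a cause.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily values, NOT real measurements.
offline_share = [0.10, 0.12, 0.20, 0.25, 0.15, 0.11]  # fraction of peers classified "Offline"
p50_latency_s = [0.60, 0.61, 0.68, 0.72, 0.64, 0.60]  # median DHT lookup latency in seconds

r = pearson(offline_share, p50_latency_s)
print(f"correlation between offline share and p50 latency: {r:.2f}")
```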

As discussed in https://github.com/plprobelab/network-measurements/issues/57, this is an issue worth looking into in more detail.