jsdelivr / globalping

A global network of probes to run network tests like ping, traceroute and DNS resolve
https://globalping.io

Multi IP handling #527

Closed MartinKolarik closed 3 months ago

MartinKolarik commented 4 months ago

A side effect of #447 is that probes are now able to switch between multiple IP addresses based on network conditions. This may interfere with our adoption logic in two ways:

The existing UUIDs are not enough to solve this as they don't survive restarts. Several ideas for solving this:

We could also reduce the amount of switching via a node setting, or disable it entirely and force IPv4 when supported, but I'm not sure that's a good idea.

jimaek commented 4 months ago

have the probe report all of its IPs and identify them by "at least one IP matches" comparison - requires some dash and API changes but solves the problem and might be useful in general,

This is the best option for me. Simple and nice UX
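For reference, the "at least one IP matches" comparison quoted above could look roughly like this. It is only a sketch: `sharesAnyIp`, `findAdoptedProbe`, and the `{ id, ips }` record shape are hypothetical names, not existing API or dash code, and it compares addresses as plain strings without normalizing equivalent IPv6 spellings.

```javascript
// Hypothetical helper: true if the two address lists have at least one IP in common.
function sharesAnyIp(knownIps, reportedIps) {
  const known = new Set(knownIps);
  return reportedIps.some((ip) => known.has(ip));
}

// Hypothetical lookup over adopted-probe records shaped like { id, ips: [...] }.
// Returns the first record sharing an IP with the connecting probe, or null.
function findAdoptedProbe(adoptedProbes, reportedIps) {
  return adoptedProbes.find((probe) => sharesAnyIp(probe.ips, reportedIps)) ?? null;
}
```

With this, a probe that was adopted under its IPv4 address would still be recognized after it reconnects over IPv6, as long as it reports both addresses.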

MartinKolarik commented 4 months ago

Simple, not really. It's definitely the hardest of those options, but also the only one that covers all cases, so I'm leaning towards it as well. We might even combine it, as adding persistence to the UUIDs is a nice and easy improvement on top.

alexey-yarmosh commented 4 months ago

E.g., a probe supports both IPv4 and IPv6. Yesterday it had an IPv4 address and was used for measurementA. Today it has an IPv6 address and was used for measurementB.

Now we run two new measurements based on the previous ones: measurementA' and measurementB'. Ideally, we want both new measurements to refer to the probe. But since repeated measurements filter by IP, the IPv4 result will be offline.

To fix this, both IPs of the probe need to be stored, so it looks like this needs to be implemented anyway.

MartinKolarik commented 4 months ago

Indeed, repeating measurements is also affected by this, but it's the same problem of probe "identity", so the UUIDs could be used here as well.

alexey-yarmosh commented 4 months ago

UUIDs refer to a device, for example a laptop, which I may bring to the office and back home every day, changing the network each time. I think in repeated measurements we want to compare networks, which are identified by IPs.

MartinKolarik commented 4 months ago

That's an interesting thought because for the user, it's going to be the same probe in that case, but I agree for measurements, it makes sense to consider them as different.

MartinKolarik commented 4 months ago

Let's start with the first option here.

We need to ensure that the IPs we get actually belong to the probe, so the way I imagine it working is:

  1. After a probe is connected, the API sends some new type of message to the probe. The message includes some token that the API stores for 5 minutes (in redis?).
  2. The probe, after it receives this message, lists all of its non-internal IPs (1) and filters out the one that's used for the API connection. For each remaining IP, it makes a request to some new API endpoint using this IP (2) and includes the token.
  3. When the API receives the request, it looks up the probe based on the token and adds the IP making the request to the list of alternative IPs (3).
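The API side of steps 1 and 3 can be sketched as follows. This is an illustration only: the `AltIpTokens` class and its method names are made up, an in-memory `Map` stands in for Redis, and the clock is injectable purely so the 5-minute expiry is easy to test.

```javascript
const crypto = require('node:crypto');

const TOKEN_TTL_MS = 5 * 60 * 1000; // tokens are kept for 5 minutes (step 1)

// Hypothetical in-memory stand-in for the token store (Redis in the proposal).
class AltIpTokens {
  constructor(now = () => Date.now()) {
    this.now = now;
    this.tokens = new Map(); // token -> { probeId, expiresAt }
    this.altIps = new Map(); // probeId -> Set of alternative IPs
  }

  // Step 1: after a probe connects, create a token to send to it.
  issue(probeId) {
    const token = crypto.randomBytes(16).toString('hex');
    this.tokens.set(token, { probeId, expiresAt: this.now() + TOKEN_TTL_MS });
    return token;
  }

  // Step 3: a request carrying `token` arrived from `requestIp`.
  // Returns false for unknown or expired tokens.
  register(token, requestIp) {
    const entry = this.tokens.get(token);
    if (!entry) return false;
    if (entry.expiresAt <= this.now()) {
      this.tokens.delete(token);
      return false;
    }
    if (!this.altIps.has(entry.probeId)) this.altIps.set(entry.probeId, new Set());
    this.altIps.get(entry.probeId).add(requestIp);
    return true;
  }

  getAltIps(probeId) {
    return [...(this.altIps.get(probeId) ?? [])];
  }
}
```

The token stays valid until it expires rather than being consumed on first use, because the probe makes one request per remaining IP with the same token.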

Notes:

  1. This is what we use in another project for similar purposes:
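    const os = require('os');
    const _ = require('lodash');

    // Collect the unique, non-internal addresses across all network interfaces.
    _(os.networkInterfaces())
        .toPairs()
        .map(p => p[1])
        .flatten()
        .uniqBy('address')
        .filter(iface => !iface.internal)
        .map('address')
        .value()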
  2. Via https://github.com/sindresorhus/got/blob/main/documentation/2-options.md#localaddress
  3. In the DB I guess the easiest way to handle this is a json field so that we can support more than 1 alternative IP. We need to track the lifetime of each IP separately - 30 days since it was last reported, same as for the primary IP.
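On the probe side, step 2 also needs to drop the address already used for the API connection, which the snippet in note 1 doesn't do. A lodash-free sketch, where `listAlternativeIps` and the sample connection IP are hypothetical:

```javascript
const os = require('node:os');

// Hypothetical helper: unique non-internal addresses, minus the one
// currently used for the API connection.
function listAlternativeIps(interfaces, connectionIp) {
  const addresses = Object.values(interfaces)
    .flat()
    .filter((iface) => iface && !iface.internal)
    .map((iface) => iface.address);
  return [...new Set(addresses)].filter((ip) => ip !== connectionIp);
}

// Example call; '192.0.2.10' is a placeholder for the real connection IP.
// Each returned address would then be passed as `localAddress` for one
// request to the new API endpoint (note 2).
console.log(listAlternativeIps(os.networkInterfaces(), '192.0.2.10'));
```

Note that `internal` only excludes loopback-style addresses, so link-local IPv6 addresses would still pass this filter; whether to skip them is left open here.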

Affected features:

MartinKolarik commented 4 months ago

We need to track the lifetime of each IP separately - 30 days since it was last reported, same as for the primary IP.

Actually, no, the alternative addresses should be tracked primarily on the API-side and be per connection, so after each new connection they start empty. If the probe is adopted, we should just clear the alternative IPs at that point and re-add if they are reported later.
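A rough sketch of that per-connection lifecycle, assuming an in-memory store on the API side; `ConnectionAltIps` and its method names are hypothetical:

```javascript
// Hypothetical API-side store: alternative IPs live only as long as the connection.
class ConnectionAltIps {
  constructor() {
    this.byConnection = new Map(); // connectionId -> Set of alternative IPs
  }

  // Every new connection starts with an empty list.
  onConnect(connectionId) {
    this.byConnection.set(connectionId, new Set());
  }

  // Called when the probe proves ownership of an additional IP.
  add(connectionId, ip) {
    const ips = this.byConnection.get(connectionId);
    if (!ips) return false; // unknown or already closed connection
    ips.add(ip);
    return true;
  }

  // On adoption, clear the list; IPs are re-added if reported again later.
  onAdopted(connectionId) {
    this.byConnection.get(connectionId)?.clear();
  }

  onDisconnect(connectionId) {
    this.byConnection.delete(connectionId);
  }

  list(connectionId) {
    return [...(this.byConnection.get(connectionId) ?? [])];
  }
}
```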

@alexey-yarmosh ping me on Slack if I missed anything or something isn't clear.