InternetHealthReport / internet-yellow-pages

A knowledge graph for the Internet
https://iyp.iijlab.net
GNU General Public License v3.0

Atlas measurements #81

Closed romain-fontugne closed 10 months ago

romain-fontugne commented 11 months ago

Get the currently running RIPE Atlas measurements (https://atlas.ripe.net/api/v2/measurements/) together with the list of probes participating in each.

Add the following relationships:

(:AtlasProbe)-[:PART_OF]->(:AtlasMeasurement)-[:TARGET]->(:DomainName)

and

(:AtlasProbe)-[:PART_OF]->(:AtlasMeasurement)-[:TARGET]->(:IP)

mohamedawnallah commented 11 months ago

I’m gonna work on this issue :)

m-appel commented 11 months ago

Thanks! To spare you some work, this is the actual API request; Romain posted only the light version:

https://atlas.ripe.net/api/v2/measurements/?is_public=true&status=2&optional_fields=current_probes

This returns only active measurements and includes the ids of the participating probes. See also the reference.
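As a rough sketch of how that request could be issued from Python, the snippet below rebuilds the URL from its query parameters and walks through the paginated response. It assumes the API wraps each page in an object with `results` and `next` keys; the helper names `build_measurements_url`, `fetch_json`, and `iter_measurements` are made up for illustration and are not part of any existing crawler.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://atlas.ripe.net/api/v2/measurements/"


def build_measurements_url(base_url=BASE_URL):
    """Return the request URL for public, ongoing measurements with probe ids."""
    params = {
        "is_public": "true",
        "status": 2,  # same value as in the request above (active measurements)
        "optional_fields": "current_probes",
    }
    return f"{base_url}?{urlencode(params)}"


def fetch_json(url):
    """Download one page and decode it as JSON (hypothetical helper)."""
    with urlopen(url) as response:
        return json.load(response)


def iter_measurements(url, fetch=fetch_json):
    """Yield measurements page by page.

    Assumption: each page is an object with a 'results' list and a 'next'
    URL that is null on the last page.
    """
    while url:
        page = fetch(url)
        yield from page.get("results", [])
        url = page.get("next")
```

Passing `fetch` as a parameter keeps the pagination loop testable without network access; a real crawler would likely add retries and rate limiting on top.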

mohamedawnallah commented 11 months ago

Okay, we're currently dealing with four main target entities within the RIPE Atlas measurement dataset (IP, domain name, prefix, and AS).

Two main questions arise:

1) Should relationships involving the AtlasMeasurement entity extend beyond IP and DomainName entities to include Prefix and AS entities?

2) When we encounter an invalid domain name, like the one shown in the screenshot, how should we handle it? Should we treat the data point atomically and discard it entirely (including the relationship with IP) if the DomainName is invalid? Or is there an internal tool that converts an IP address to a domain name, similar to an ip-to-hostname tool? Alternatively, is it acceptable for an AtlasMeasurement node to have, for example, a relationship with an IP but not with an equivalent DomainName, which would resolve this inconsistency more gracefully?

The screenshot gives an overview of RIPE Atlas measurements; the specific data point shown in the screenshot is accessible here: RIPE Atlas Measurements

[Screenshot: 2023-12-13_15-39]

m-appel commented 11 months ago

With RIPE Atlas you can target either an IP directly, or specify a hostname that is resolved to one (or more) IPs by the system. So target can indeed be either of these cases, like you mentioned.

Both are valid, so in the example above you would only create a link to IP without any domain name. Btw. there is also a resolved_ips list that might be preferred over the target_ip field. I believe this can contain multiple IPs if a domain name was specified as the target, but I'm not sure. In this case we want to add links to all of them.

Long story short: Always add target_ip (or resolved_ips) and add target if it's a domain name. We do not need target_asn and target_prefix for now, since we have this connection already in our database (and technically the measurement is not really targeting an AS or prefix, but a specific host within them).
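A minimal sketch of that rule, assuming each measurement is a dict with the `target`, `target_ip`, and `resolved_ips` fields mentioned above: collect every IP, and keep `target` as a domain name only if it does not parse as an IP address. The function names and the returned tuple are hypothetical, chosen for illustration only.

```python
import ipaddress


def is_ip_address(value):
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def target_links(measurement):
    """Return (ips, domain) describing the TARGET links for one measurement.

    ips    -- sorted list of IPs from target_ip and resolved_ips
    domain -- the target string if it is not an IP address, else None
    """
    ips = set(measurement.get("resolved_ips") or [])
    target_ip = measurement.get("target_ip")
    if target_ip:
        ips.add(target_ip)

    target = measurement.get("target")
    domain = target if target and not is_ip_address(target) else None
    return sorted(ips), domain
```

For a measurement whose `target` is itself an IP, this yields only IP links and no DomainName link, which matches the case shown in the screenshot. `target_asn` and `target_prefix` are deliberately ignored.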