InternetHealthReport / internet-yellow-pages

A knowledge graph for the Internet
https://iyp.iijlab.net
GNU General Public License v3.0

Update Atlas probe crawler to fetch all probes #93

Status: Closed (closed by m-appel 10 months ago)

m-appel commented 10 months ago

The planned measurement crawler requires more than just the connected probes to be in the graph. Long-running measurements in particular can contain disconnected probes, which we might still want to model.

Note that this crawler can now create dangling nodes. For example, probes with the status "Never Connected" have no IP, ASN, or country. Rather than arbitrarily deciding what to include, we fetch all (public) probes, since the number is not very large.
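
To illustrate how a dangling node can arise, here is a minimal sketch and not the crawler's actual code. It assumes probe records are dicts with fields such as `address_v4`, `asn_v4`, and `country_code`; the function name `probe_links` and the relationship labels are hypothetical and chosen only for illustration.

```python
def probe_links(probe: dict) -> list[tuple[str, str, object]]:
    """Return (relationship, target label, target value) tuples for one probe.

    Field names and relationship labels are assumptions for this sketch.
    Missing or empty fields produce no link at all.
    """
    links = []
    for af in (4, 6):
        ip = probe.get(f'address_v{af}')
        if ip:
            links.append(('ASSIGNED', 'IP', ip))
        asn = probe.get(f'asn_v{af}')
        if asn:
            links.append(('LOCATED_IN', 'AS', asn))
    country = probe.get('country_code')
    if country:
        links.append(('COUNTRY', 'Country', country))
    return links


# Hypothetical example records.
connected = {
    'id': 1, 'status': {'name': 'Connected'},
    'address_v4': '192.0.2.1', 'asn_v4': 64496,
    'address_v6': None, 'asn_v6': None, 'country_code': 'NL',
}
never_connected = {
    'id': 2, 'status': {'name': 'Never Connected'},
    'address_v4': None, 'asn_v4': None,
    'address_v6': None, 'asn_v6': None, 'country_code': None,
}

connected_links = probe_links(connected)
dangling_links = probe_links(never_connected)
```

In this sketch the never-connected probe still becomes a node, but `dangling_links` is empty, so nothing attaches it to the rest of the graph.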

This commit also updates the IPv6 handling to guarantee a canonical form.
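
As a sketch of what a canonical IPv6 form can look like (an assumption about the approach, not necessarily what this commit does), Python's standard `ipaddress` module can normalize different spellings of the same address to one lowercase, zero-compressed string. The helper name `canonicalize_ipv6` is hypothetical.

```python
import ipaddress


def canonicalize_ipv6(address: str) -> str:
    """Return the compressed, lowercase text form of an IPv6 address."""
    return ipaddress.IPv6Address(address).compressed


# Three spellings of the same address.
variants = [
    '2001:0DB8:0000:0000:0000:0000:0000:0001',
    '2001:DB8::1',
    '2001:db8:0:0:0:0:0:1',
]
canonical = {canonicalize_ipv6(v) for v in variants}
```

Without this step, each spelling could end up as a separate IP node; after it, all three collapse to a single value.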

How Has This Been Tested?

Deleted all Atlas probes from a current dump and ran the crawler.
