InternetHealthReport / internet-yellow-pages

A knowledge graph for Internet resources
GNU General Public License v3.0

Add Alice-LG crawler #99

Closed m-appel closed 9 months ago

m-appel commented 9 months ago

This crawler imports membership and, optionally, routing information from IXPs that use the Alice-LG looking glass.

Description

The Alice-LG looking glass is used by some of the largest IXPs and provides a good opportunity to get additional membership information via a single crawler.

Motivation and Context

Currently we only get IXP membership information from PeeringDB. However, this data is usually not updated automatically and might be outdated. Getting the membership information directly from the looking glass, i.e., the route servers at the IXP, provides a data-driven view of the IXP members.

This approach might miss IXP members that do not peer with the route server.
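As a rough illustration of the comparison described above, the sketch below extracts member ASNs from a looking-glass neighbor listing and compares them with a PeeringDB-derived set. The response shape (a `neighbors` list with `asn` and `state` fields), the filter on session state, and all sample values are assumptions made for illustration; they are not taken from this PR.

```python
def member_asns(neighbors_response):
    """Return ASNs of neighbors with an established session.

    The field names ("neighbors", "asn", "state") are assumed for this sketch.
    """
    return {
        n["asn"]
        for n in neighbors_response.get("neighbors", [])
        if n.get("state", "").lower() == "up"
    }


# Hypothetical sample data using documentation ASNs (RFC 5398).
sample = {
    "neighbors": [
        {"asn": 64496, "state": "up"},
        {"asn": 64497, "state": "down"},
        {"asn": 64498, "state": "Up"},
    ]
}
peeringdb_members = {64496, 64497, 64499}

seen_at_route_server = member_asns(sample)
# Listed in PeeringDB but not seen with an active session at the route server,
# e.g. stale entries or members that do not peer with the route server.
only_in_peeringdb = peeringdb_members - seen_at_route_server
# Seen at the route server but missing from PeeringDB.
only_at_route_server = seen_at_route_server - peeringdb_members
```

Neither source is a superset of the other, which is why the two views complement each other.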

Originally, we planned to get routing information from the members as well. However, testing showed that this takes too long for most IXPs due to the small pagination size of the API. Furthermore, we already import route collector data from PCH, which is present at the largest IXPs.

Therefore, the functionality is included, but disabled by default.
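A back-of-the-envelope sketch of why a small page size makes the routing import slow: every neighbor needs at least one request, plus one more per additional page of routes. The route counts, the page size, and the per-request latency below are hypothetical values, not measurements from this PR.

```python
import math


def estimate_requests(routes_per_neighbor, page_size):
    """Number of paginated requests needed to fetch all routes of every neighbor."""
    return sum(math.ceil(r / page_size) for r in routes_per_neighbor)


# Hypothetical: three members announcing 120, 40,000 and 250,000 routes,
# fetched with an assumed page size of 250 routes per request.
requests_needed = estimate_requests([120, 40_000, 250_000], page_size=250)
# 1 + 160 + 1000 = 1161 requests

# At an assumed one second per request, expressed in hours.
hours = requests_needed / 3600
```

The request count grows linearly with the number of routes, so a few members with large tables dominate the total runtime.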

Closes #41.

How Has This Been Tested?

Ran each crawler on its own and ran a full test including routing data for BCIX.

romain-fontugne commented 9 months ago

Small detail: I like to keep the Cloudflare crawlers at the end of the list of crawlers in the config file. These are by far the slowest crawlers we have, and running them at the end has two advantages:

  1. we detect early if one of the other crawlers crashes
  2. the other crawlers can fetch data within the same few hours; if we run the Cloudflare crawlers in the middle, the last crawlers will fetch data on the following day.
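The ordering preference above could be checked or enforced with a small helper like the following sketch. It assumes the config holds crawlers as a list of module-path strings; the paths shown are hypothetical placeholders, not the project's actual crawler names.

```python
def slow_crawlers_last(crawlers, slow_marker="cloudflare"):
    """Stable reorder: crawlers whose path contains slow_marker are moved to the end."""
    fast = [c for c in crawlers if slow_marker not in c]
    slow = [c for c in crawlers if slow_marker in c]
    return fast + slow


# Hypothetical module paths, for illustration only.
crawlers = [
    "iyp.crawlers.cloudflare.example_a",
    "iyp.crawlers.alice_lg.example_ixp",
    "iyp.crawlers.cloudflare.example_b",
    "iyp.crawlers.peeringdb.example",
]
ordered = slow_crawlers_last(crawlers)
```

The relative order within the fast group and within the slow group is preserved, so only the position of the slow crawlers changes.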

romain-fontugne commented 9 months ago

thanks!!