InternetHealthReport / ihr-website

Vue.js code for IHR website
https://ihr.iijlab.net/ihr/en
GNU General Public License v3.0

Enhancement for `asnames.txt` file to Capture All ASes Encountered in the Global Report #725

Closed mohamedawnallah closed 10 months ago

mohamedawnallah commented 11 months ago

Is your feature request related to a problem? Please describe.

Currently, ripe_asnames.txt doesn't contain all the ASes we encounter while aggregating alarms in the Global Report. This means we discard the missing ASes, which may contain valuable insights.

Describe the solution you'd like

We currently log missing ASes in the console, as in the attached screenshot. Our Continuous Integration (CI) includes a test, AsNames.test.js, that validates new entries in asnames.txt against the pattern <AS_NUMBER><SPACE><AS_NAME><COMMA><SPACE><COUNTRYCODE> (e.g., 15176 AS-INOC, US). We propose adding a review step for the meaning of new entries:

  1. Contributors submit a PR to add an entry in the asnames.txt file.
  2. The CI checks that the entry matches the specified pattern.
  3. A reviewer verifies the semantics of the added content.
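
As an illustration of the pattern check in step 2, here is a minimal sketch in JavaScript. It is not the actual AsNames.test.js; the regular expression, the `parseAsLine` helper, and the assumption that the country code is two uppercase letters are all hypothetical and only mirror the pattern described above.

```javascript
// Hypothetical validator for a single asnames.txt line (not the real CI test).
// Assumed format: <AS_NUMBER><SPACE><AS_NAME><COMMA><SPACE><COUNTRYCODE>
// Assumption: the country code is exactly two uppercase letters.
const AS_LINE = /^(\d+) (.+), ([A-Z]{2})$/;

// Returns the parsed fields, or null when the line does not match the pattern.
function parseAsLine(line) {
  const match = AS_LINE.exec(line);
  if (match === null) return null;
  const [, asn, name, countryCode] = match;
  return { asn: Number(asn), name, countryCode };
}

console.log(parseAsLine('15176 AS-INOC, US')); // parsed object
console.log(parseAsLine('15176,AS-INOC US')); // null: separators are in the wrong place
```

A check like this only covers the syntax. Whether the name and country actually belong to that AS is what the reviewer in step 3 would confirm.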

Please let me know if this issue makes sense. I'd love to submit a document on how to add an AS to our file, provided we agree on the priority of the data sources for AS information.

Screenshots

For example, here we ignored 17 alarms in AS151716 because it was missing from our file:

[Screenshot 2023-12-31 at 10 30 09 PM]
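
To make the discard behaviour concrete, here is a rough JavaScript sketch of a local lookup that tallies alarms whose AS is missing from the names file. The function names, the shape of the alarm objects (`{ asn }`), and the log message are assumptions for illustration, not the code in this repository.

```javascript
// Hypothetical sketch: build an ASN -> "NAME, CC" lookup from the file contents.
function buildAsNameMap(fileText) {
  const names = new Map();
  for (const line of fileText.split('\n')) {
    const sep = line.indexOf(' ');
    if (sep <= 0) continue; // skip blank or malformed lines
    const asn = Number(line.slice(0, sep));
    if (!Number.isInteger(asn)) continue; // skip lines without a numeric ASN
    names.set(asn, line.slice(sep + 1));
  }
  return names;
}

// Counts, per ASN, the alarms that would be ignored because the ASN has no entry.
function countIgnoredAlarms(alarms, names) {
  const ignored = new Map();
  for (const { asn } of alarms) {
    if (!names.has(asn)) ignored.set(asn, (ignored.get(asn) ?? 0) + 1);
  }
  return ignored;
}

const names = buildAsNameMap('15176 AS-INOC, US\n');
const alarms = [{ asn: 15176 }, { asn: 151716 }, { asn: 151716 }];
for (const [asn, count] of countIgnoredAlarms(alarms, names)) {
  console.log(`Ignored ${count} alarms in AS${asn}: missing from the names file`);
}
```

A per-ASN tally like this could also serve as the list of candidate entries for contributors to add.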

Describe alternatives you've considered

We could use the Network IHR endpoint to get the missing ASes, but it would add overhead and latency with little value in return. This has also been discussed here: https://github.com/orgs/InternetHealthReport/discussions/14#discussioncomment-7144085

Additional context

I explored several websites for AS information, but I'm unsure how they rank in terms of reliability and true positive rate:

  1. Stat.RIPE
  2. IPinfo.io - Query example: https://ipinfo.io/AS151716
  3. ASRank - Query example: https://asrank.caida.org/asns?asn=AS151716&type=search

Seeking input on the reliability and prioritization of these data sources for AS information.

dpgiakatos commented 11 months ago

I think we can replace the asnames.txt with the IYP API after the merge of PR #719. What are your thoughts on this approach?

mohamedawnallah commented 11 months ago

It seems these requests are still external network requests, right? If that's the case, I'm concerned about the potential for a high volume of network requests. Considering the numerous alarms we've observed in the global report—on January 1, 2024, for instance, a three-day report extracted approximately 10,300 alarms from the Hegemony alarm type alone, not to mention the additional 10 alarm types—we might encounter a substantial number of network requests. Furthermore, even if the API supports batching multiple ASes in a single request, it could still result in a considerable volume. Additionally, IYP internally still utilizes the same asnames.txt from RIPE, as extracted in the as_names crawler.
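
One way to estimate the request volume under batching is sketched below in JavaScript. The batch size and the alarm shape (`{ asn }`) are assumptions, and no real IYP API call is made; the sketch only groups the distinct ASNs into batches so the number of requests can be counted.

```javascript
// Splits an array into consecutive chunks of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Hypothetical planner: one batch corresponds to one external request.
function planBatches(alarms, batchSize) {
  const uniqueAsns = [...new Set(alarms.map((alarm) => alarm.asn))];
  return chunk(uniqueAsns, batchSize);
}

const sampleAlarms = [15176, 151716, 15176, 2497, 151716].map((asn) => ({ asn }));
const batches = planBatches(sampleAlarms, 2); // batch size of 2 is an arbitrary example
console.log(`${sampleAlarms.length} alarms -> ${batches.length} requests`);
```

The request count is the number of distinct ASNs divided by the batch size, rounded up, so it depends on how many distinct ASes the alarms cover and on the batch size the API would accept, neither of which is stated in this thread.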

dpgiakatos commented 11 months ago

The API supports batching multiple ASes in a single request, but you are correct. Let's leave it as it is, with the external file. Thank you!