InternetHealthReport / internet-yellow-pages

A knowledge graph for Internet resources
GNU General Public License v3.0

bgp.tools AS names #133

Closed romain-fontugne closed 6 months ago

romain-fontugne commented 6 months ago

Describe the bug BGP.tools has changed the CSV file we get AS names from (https://bgp.tools/asns.csv). It now includes a class column that gives the type of AS (e.g. eyeball, content, carrier). As a result we end up with names like name: Internet Initiative Japan Inc.,Carrier instead of name: Internet Initiative Japan Inc.

To Reproduce See: MATCH p = (:AS {asn:2497})-[:NAME {reference_name: 'bgptools.as_names'}]-(:Name) RETURN p

Expected behavior We should ignore the last column in the CSV file. Actually, that script may need refactoring so that it uses the header line, which would avoid such problems in the future.
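A minimal sketch of header-based parsing, using only the standard library. The header `asn,name,class`, the `AS` prefix on the ASN field, and the helper name `read_as_names` are assumptions made for illustration; they are not taken from the crawler code or verified against the live file.

```python
import csv
import io

# Hypothetical sample; the real header of asns.csv is assumed to be asn,name,class.
SAMPLE = '''asn,name,class
AS2497,Internet Initiative Japan Inc.,Carrier
AS64496,Example Network,Eyeball
'''


def read_as_names(text):
    """Map ASN (int) to AS name, picking columns by header name, not position."""
    reader = csv.DictReader(io.StringIO(text))
    names = {}
    for row in reader:
        asn_field = row['asn']
        # Assumption: ASNs are written as 'AS<number>'; strip the prefix if present.
        if asn_field.startswith('AS'):
            asn_field = asn_field[2:]
        names[int(asn_field)] = row['name']
    return names


names = read_as_names(SAMPLE)
print(names[2497])
```

Because the columns are looked up by name, an extra column appended to the file is simply ignored instead of leaking into the name.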

m-appel commented 6 months ago

I was going to propose just using

pandas.read_csv('https://bgp.tools/asns.csv')

Sadly that does not work: we have to set the User-Agent header, otherwise we get an HTTP 403 error. But we can still use pandas to read the CSV after retrieving it.
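One way this could look: download the file with an explicit User-Agent, then pass the text to pandas. This is a sketch, not the crawler's actual code. The User-Agent string and the function names `fetch_csv_text` and `parse_asns` are hypothetical.

```python
import io
import urllib.request

import pandas as pd

URL = 'https://bgp.tools/asns.csv'
# Hypothetical User-Agent value; replace with one that identifies your project.
USER_AGENT = 'example-crawler - contact@example.org'


def fetch_csv_text(url=URL, user_agent=USER_AGENT):
    """Download the CSV as text, sending an explicit User-Agent header."""
    req = urllib.request.Request(url, headers={'User-Agent': user_agent})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read().decode('utf-8')


def parse_asns(text):
    """Parse already-downloaded CSV text into a DataFrame."""
    return pd.read_csv(io.StringIO(text))


# Usage (needs network access):
#     df = parse_asns(fetch_csv_text())
```

Keeping the download and the parsing separate also makes the parsing step easy to test on a small in-memory sample. Recent pandas versions additionally document a `storage_options` argument on `read_csv` that forwards key-value pairs as request headers for HTTP(S) URLs, which might remove the need for a separate download step; that has not been checked against bgp.tools here.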

romain-fontugne commented 6 months ago

Yes, that would simplify the way we parse the CSV file. IIRC the code looks a bit funky because names can contain commas. Hopefully pandas will get this right.
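A quick check of the comma concern, assuming that names containing commas are double-quoted in the file (the sample row and header below are made up for illustration). Under that assumption pandas keeps the name intact, and `usecols` drops the class column by name:

```python
import io

import pandas as pd

# Hypothetical row: a name containing a comma, wrapped in double quotes.
SAMPLE = 'asn,name,class\nAS64496,"Example Networks, Inc.",Eyeball\n'

# Select columns by header name so an added column cannot leak into the name.
df = pd.read_csv(io.StringIO(SAMPLE), usecols=['asn', 'name'])
print(df.loc[0, 'name'])
```

If the file ever contained unquoted commas inside names, no CSV parser could split those rows reliably, so that case would still need special handling.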