massimocandela / geofeed-finder

Utility to find geofeed files linked from rpsl.
BSD 3-Clause "New" or "Revised" License

Keeping the feed files separate #10

Closed by sid6mathur 2 years ago

sid6mathur commented 2 years ago

Wonderful tool, thank you! :)

Newbie question: Is there a way to avoid fetching and merging all the feed contents discovered by the tool, and instead only dump the list of URLs discovered? Ideally, keyed by subnet, org name, ASN, or RDAP URL.

For example, if the tool discovered "https://ip-geolocation.fastly.com" as a CSV feed URL, it would be nice to be able to get context as to who contributed that file - say "Fastly Inc." as described in the organization name field of a particular RDAP object.

Thank you!

paulgao commented 2 years ago

I have the same requirement; just a list of geofeed URLs is fine.

randyqx commented 2 years ago

Would you please specify how, programmatically, one would get from https://ip-geolocation.fastly.com/ to Fastly Inc.?

massimocandela commented 2 years ago

> just a list of geofeed URLs is fine.

Getting a list of geofeed URLs is just bulk whois data parsing; there are other generic tools that can be used for that (including a simple grep).
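To make the "simple grep" concrete, here is a minimal Python sketch, not part of geofeed-finder. It assumes RPSL-style text with blank-line-separated objects and looks for the two link forms described in RFC 9092: a `geofeed:` attribute, or a `remarks:` line starting with `Geofeed`. The function name `extract_geofeed_links` and the sample addresses and URLs are made up for illustration, and the matching is deliberately lenient (case-insensitive).

```python
import re

# Lenient match for the two link forms described in RFC 9092:
#   geofeed: https://example.com/geofeed.csv
#   remarks: Geofeed https://example.com/geofeed.csv
GEOFEED_RE = re.compile(
    r"^(?:geofeed:\s*|remarks:\s*geofeed:?\s+)(https://\S+)\s*$",
    re.IGNORECASE,
)
KEY_RE = re.compile(r"^(inetnum|inet6num):\s*(.+?)\s*$", re.IGNORECASE)


def extract_geofeed_links(rpsl_text):
    """Return (inetnum or inet6num, url) pairs found in RPSL-style text."""
    links = []
    # Objects in a bulk whois dump are assumed to be separated by blank lines.
    for block in re.split(r"\n\s*\n", rpsl_text.strip()):
        key = None
        for line in block.splitlines():
            key_match = KEY_RE.match(line)
            if key_match:
                key = key_match.group(2)
                continue
            feed_match = GEOFEED_RE.match(line)
            if feed_match and key is not None:
                links.append((key, feed_match.group(1)))
    return links


SAMPLE = """\
inetnum:        192.0.2.0 - 192.0.2.255
netname:        EXAMPLE-NET
remarks:        Geofeed https://example.com/geofeed.csv

inet6num:       2001:db8::/32
geofeed:        https://example.net/feed.csv
"""

for key, url in extract_geofeed_links(SAMPLE):
    print(key, url)
```

This yields one pair per link, keyed by the inetnum or inet6num that references the file.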

However, that is a bad idea: a geofeed file can contain any prefix, including prefixes that do not belong to the owner of the file. Geolocation providers that simply import geofeed files are setting themselves up for a serious vulnerability. This tool follows RFC 9092 to validate ownership of the resources based on the inetnum from which the file is linked. The only way to do that is to parse the prefixes in the geofeed file together with the inetnum. I prefer not to promote unsafe ways of processing the data.
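A rough Python sketch of that ownership check follows. It is not the tool's actual code. It assumes the geofeed file is in the RFC 8805 CSV layout (prefix in the first column, `#` for comments) and keeps only rows whose prefix is covered by the inetnum that links the file. The function names `inetnum_to_networks` and `filter_geofeed` are hypothetical, and the sample data uses documentation prefixes.

```python
import csv
import ipaddress


def inetnum_to_networks(inetnum):
    """Convert 'first - last' or CIDR notation into a list of networks."""
    if "-" in inetnum:
        first, last = (ipaddress.ip_address(p.strip()) for p in inetnum.split("-"))
        return list(ipaddress.summarize_address_range(first, last))
    return [ipaddress.ip_network(inetnum.strip(), strict=False)]


def filter_geofeed(geofeed_csv, inetnum):
    """Split geofeed rows into (accepted, rejected) by coverage of the inetnum."""
    allowed = inetnum_to_networks(inetnum)
    accepted, rejected = [], []
    # Skip blank lines and '#' comments before handing the rest to the CSV parser.
    lines = (
        line for line in geofeed_csv.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    )
    for row in csv.reader(lines):
        try:
            prefix = ipaddress.ip_network(row[0].strip(), strict=False)
        except ValueError:
            rejected.append(row)  # not a parseable prefix
            continue
        covered = any(
            prefix.version == net.version and prefix.subnet_of(net)
            for net in allowed
        )
        (accepted if covered else rejected).append(row)
    return accepted, rejected


FEED = """\
# sample geofeed using documentation prefixes
192.0.2.0/25,US,US-CA,San Francisco,
198.51.100.0/24,NL,,,
2001:db8::/48,IT,,,
"""

accepted, rejected = filter_geofeed(FEED, "192.0.2.0 - 192.0.2.255")
print([row[0] for row in accepted])
print([row[0] for row in rejected])
```

Only the first row survives here, because the other two prefixes fall outside the linking inetnum. The sketch covers the coverage check only and leaves out everything else a real implementation would need, such as fetching the file.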

sid6mathur commented 2 years ago

Thanks Massimo. I agree that verification of ownership is key; it's also mentioned in RFC 8805.

Would it be OK to add an intermediate representation file (a "debug file") that captures a set of properties keyed by inetnum or inet6num? Use case: the tool user wishes to run a second pass on the RDAP URL of the subnet (the key) to retrieve additional properties.

massimocandela commented 2 years ago

Hi @s8mathur,

You can directly use bulk-whois-parser, which I'm also using in this project to do an intermediate step similar to what you are asking.

https://github.com/massimocandela/geofeed-finder/blob/c15ae2491b174f6d5329e5a12c219c93661b3f4e/src/finder.js#L46.