massimocandela / geofeed-finder

Utility to find geofeed files linked from RPSL.
BSD 3-Clause "New" or "Revised" License

v1.5 crashes when parsing the latest apnic data #18

Closed (sgteq closed this issue 2 years ago)

sgteq commented 2 years ago

Running geofeed-finder-linux-x64-1.5 crashes, both with "-i apnic" and without it. The last two lines of output:

inetnum: 78.142.246.0/23 https://ns1.vpshispeed.com/geofeed.csv [cache]
Illegal argument undefined

The output file "result.csv" is not created. A week ago it was running fine, so the crash is caused by the new APNIC data. Deleting the cache doesn't help.

massimocandela commented 2 years ago

Hi @sgteq,

Thanks a lot for reporting this! To improve performance, I was first filtering remarks starting with the word Geofeed and only afterwards properly validating the associated URL. However, I was not properly handling the case where somebody adds a Geofeed remark that is not followed by a URL. This happened 3 days ago with an inetnum in APNIC.

This is now fixed and released in v1.5.1. Sorry for the inconvenience.
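
For readers following along, the two-step approach described above (a cheap prefix filter, then URL validation) can be sketched roughly as follows. This is a minimal Python illustration, not the project's actual code: the function name extract_geofeed_url is hypothetical, and the rule that a URL must be HTTPS with a non-empty host is an assumption made for the sketch. The key point is the explicit check for a Geofeed remark with nothing after the keyword.

```python
import re
from urllib.parse import urlparse

# Cheap prefilter: remark starts with the word "Geofeed" (case-insensitive).
GEOFEED_PREFIX = re.compile(r"^\s*geofeed\b", re.IGNORECASE)


def extract_geofeed_url(remark):
    """Return the URL from a Geofeed remark, or None if there is no valid one.

    Hypothetical helper for illustration only.
    """
    if not GEOFEED_PREFIX.match(remark):
        return None

    tokens = remark.split()
    # A remark consisting of "Geofeed" alone has no second token.
    # Skipping this check is the kind of gap that leads to an
    # "undefined" value being passed on to the validator.
    if len(tokens) < 2:
        return None

    candidate = tokens[1]
    parsed = urlparse(candidate)
    # Assumption for this sketch: only HTTPS URLs with a host are accepted.
    if parsed.scheme != "https" or not parsed.netloc:
        return None
    return candidate
```

With this shape, a remark such as "Geofeed" on its own is simply skipped instead of aborting the whole run.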

sgteq commented 2 years ago

Thanks for the quick fix @massimocandela! The issue is indeed fixed, but I discovered a side effect of the v1.5.1 release: it no longer finds 53 feed URLs that were found before, a significant drop.

all-2022-08-30.log: 299 URLs 4,166,912 IPv4 addresses
all-2022-09-04.log: 252 URLs 3,546,950 IPv4 addresses

53 previous URLs not found, 6 new URLs found = net -47 URLs. This appears to have multiple causes. One is that v1.5.1 is stricter about the remark format, but another is unclear. I'm submitting it as a separate issue.
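
The lost/new/net figures above are a plain set comparison between the URLs found by the two runs. A rough Python sketch of that comparison, assuming the URLs from each log have already been collected into lists (the log format is not shown here, so parsing is left out, and compare_runs is a hypothetical name):

```python
def compare_runs(previous_urls, current_urls):
    """Compare the feed URLs found by two runs.

    Returns (lost, gained, net), where lost and gained are sets of URLs
    and net is the change in the number of distinct URLs.
    """
    previous, current = set(previous_urls), set(current_urls)
    lost = previous - current      # found before, missing now
    gained = current - previous    # newly found
    net = len(current) - len(previous)
    return lost, gained, net
```

Listing the contents of the lost set is what makes it possible to check each missing URL against the stricter remark format.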