massimocandela / geofeed-finder

Utility to find geofeed files linked from rpsl.
BSD 3-Clause "New" or "Revised" License

ECONNRESET during RIR data downloads #27

Closed by sid6mathur 1 year ago

sid6mathur commented 1 year ago

I frequently see ECONNRESET errors during whois downloads. This happens across all versions, particularly on my home broadband (fiber, 400 Mbps symmetric). The errors occur within a few seconds of a whois data download starting. They are NOT observed when running the tool on an EC2 server in a cloud DC, so low-latency backbones appear to mitigate the issue. Can we catch these exceptions and print more insightful errors about which RIR download was affected?
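As a rough illustration of the request above, here is a minimal Python sketch of catching a connection reset per source and reporting which download was affected. It is not the tool's code: `download_all` and `flaky_fetch` are hypothetical names, and the fetch function is a stand-in for a real network download.

```python
from typing import Callable, Dict, List, Tuple


def download_all(
    sources: List[str],
    fetch: Callable[[str], bytes],
) -> Tuple[Dict[str, bytes], Dict[str, str]]:
    """Fetch every source, recording which ones failed instead of aborting."""
    data: Dict[str, bytes] = {}
    errors: Dict[str, str] = {}
    for name in sources:
        try:
            data[name] = fetch(name)
        except ConnectionResetError as exc:
            # ECONNRESET surfaces as ConnectionResetError in Python.
            errors[name] = f"[{name}] connection reset during download: {exc}"
        except OSError as exc:
            # Any other network or I/O failure, still labeled with the source.
            errors[name] = f"[{name}] download failed: {exc}"
    return data, errors


def flaky_fetch(name: str) -> bytes:
    # Hypothetical stand-in for a real download; "arin" always fails here.
    if name == "arin":
        raise ConnectionResetError("peer closed the connection")
    return b"whois data for " + name.encode()


data, errors = download_all(["ripe", "arin", "apnic"], flaky_fetch)
for message in errors.values():
    print(message)
```

The point of the design is only that each error message carries the name of the source that failed, so a reset can be attributed to one RIR rather than to the run as a whole.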

Also, I suspect the cached file is marked as "successfully downloaded" in the above scenario, even if it's a partial file and not (re)downloaded on the tool's next run. This is an unsubstantiated hunch at this time, but the output CSV always yields fewer rows and no hard error on stderr or with the return value. And the next run always says all RIR files are being processed from cache.
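One common way to rule out the partial-cache scenario described above is to download into a temporary file and rename it into place only once the transfer has completed. The sketch below assumes nothing about how geofeed-finder actually writes its cache; `save_to_cache` and the file names are hypothetical.

```python
import os
import tempfile
from typing import Iterable


def save_to_cache(chunks: Iterable[bytes], cache_path: str) -> None:
    """Write chunks to a temporary file and rename it only when complete."""
    directory = os.path.dirname(os.path.abspath(cache_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                tmp.write(chunk)
        # os.replace is atomic when source and target share a filesystem.
        os.replace(tmp_path, cache_path)
    except BaseException:
        # The download was interrupted: discard the partial file and re-raise.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def interrupted_download():
    # Simulates a transfer that is reset after the first chunk.
    yield b"partial "
    raise ConnectionResetError("simulated ECONNRESET")


cache_dir = tempfile.mkdtemp()
cache_file = os.path.join(cache_dir, "whois.cache")

try:
    save_to_cache(interrupted_download(), cache_file)
except ConnectionResetError:
    pass
left_partial = os.path.exists(cache_file)  # whether a partial file reached the cache

save_to_cache([b"complete ", b"data"], cache_file)
```

With this pattern, a later run never sees a half-written file under the final cache name, so "using cached data" can only refer to a download that finished.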

Thank you! :)

~$ \rm -rf ~/.cache/
~$ ~/bin/geofeed-finder-linux-x64 --keep-non-iso --keep-invalid-subdivisions --download-timeout 0 -o ~/auto-geofeed-latest.csv
[ripe] Downloading whois data
[afrinic] Downloading whois data
[apnic] Downloading whois data
[arin] Downloading stat file
[lacnic-rir] Downloading whois data
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0% | ETA: 0s | 0/80434
Cannot retrieve 198.17.99.0 ECONNRESET
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0% | ETA: 47215s | 1/80434
Cannot retrieve 2620:1fc:: ECONNRESET
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0% | ETA: 4835s | 14/80434
Cannot retrieve 198.17.96.0 ECONNRESET
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0% | ETA: 6720s | 15/80434
Cannot retrieve 198.17.95.0 ECONNRESET
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0% | ETA: 2440s | 26/80434
[lacnic-rir] Parsing whois data: inetnum
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0% | ETA: 2428s | 100/80434
[lacnic-rir] Using cached whois data
[lacnic-rir] Parsing whois data: inet6num
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0% | ETA: 4499s | 101/80434
[afrinic] Parsing whois data: inetnum,inet6num
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0% | ETA: 2439s | 344/80434
[apnic] Parsing whois data
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 1% | ETA: 2486s | 842/80434
Cannot retrieve 2620:12e:d000:: ECONNRESET
█░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 1% | ETA: 2384s | 1268/80434

massimocandela commented 1 year ago

Hi @s8mathur,

The ECONNRESET errors come only from the ARIN RDAP API. They cannot occur with the other RIRs, for which the API is not used (a failed download there terminates the application). On average a run makes 80k+ queries to ARIN, so some tolerance is needed. These failures don't corrupt the cache.

In your logs I see 5 failed queries, which is more than I have ever seen. However, the probability that these 5 failures out of 80k+ queries affected an inetnum with a geofeed reference is ~0. Also, the logs say exactly which inetnums could not be downloaded (none of the ones in your log have geofeeds).
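To make the "some tolerance" idea concrete, here is a small Python sketch of retrying each per-prefix query a few times and collecting the prefixes that still fail, followed by the failure rate implied by 5 failures against the 80434 total shown in the progress bar. The function name, the retry count, and the query callback are assumptions for illustration, not the tool's actual behaviour.

```python
from typing import Callable, Dict, List, Tuple


def query_with_retries(
    prefixes: List[str],
    query: Callable[[str], dict],
    attempts: int = 3,  # hypothetical retry budget per prefix
) -> Tuple[Dict[str, dict], List[str]]:
    """Query each prefix, retrying on connection resets; report what failed."""
    results: Dict[str, dict] = {}
    failed: List[str] = []
    for prefix in prefixes:
        for _ in range(attempts):
            try:
                results[prefix] = query(prefix)
                break
            except ConnectionResetError:
                continue  # transient reset: try this prefix again
        else:
            # Every attempt was reset: record the prefix and move on.
            failed.append(prefix)
    return results, failed


# 5 failed queries against the 80434 total from the progress bar in the log.
failure_rate = 5 / 80434
print(f"{failure_rate:.4%}")  # prints 0.0062%
```

A failed prefix only matters for the output if that inetnum carries a geofeed reference, which is why a handful of failures at this rate is very unlikely to change the resulting CSV.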

By the time the application starts parsing an RIR, the whois data has been downloaded correctly. If the whois cache is corrupted, I report the file and terminate the application. Since the corrupted file is a single compressed file, the next run will trigger the same error and terminate again.
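For anyone who wants to check a cached file by hand, a truncated compressed file can be detected by decompressing it to the end. The sketch below assumes the cache is gzip-compressed, which this thread does not state (it only says "a single compressed file"); `is_intact_gzip` and the file names are hypothetical.

```python
import gzip
import os
import tempfile
import zlib


def is_intact_gzip(path: str) -> bool:
    """Return True if the whole gzip stream decompresses without error."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read to the end in 1 MiB steps; truncation raises EOFError.
            while handle.read(1 << 20):
                pass
        return True
    except (EOFError, OSError, zlib.error):
        return False


# Build one complete file and one truncated copy to compare.
workdir = tempfile.mkdtemp()
good = os.path.join(workdir, "good.gz")
bad = os.path.join(workdir, "truncated.gz")

with gzip.open(good, "wb") as handle:
    handle.write(os.urandom(4096))
with open(good, "rb") as src, open(bad, "wb") as dst:
    dst.write(src.read()[:2000])  # cut the stream off partway through

good_ok = is_intact_gzip(good)
bad_ok = is_intact_gzip(bad)
print(good_ok, bad_ok)
```

Because a compressed stream ends with a checksum, a partial download cannot pass this check silently, which matches the behaviour described above: a corrupted cache produces a hard error rather than fewer rows.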

The only cases in which a run can produce fewer entries are:

I see you use download-timeout=0. This is the timeout to download geofeed files (not whois). If you set it to 0, the default timeout is applied (10 seconds).
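The fallback described above, where 0 means "use the default" rather than "no timeout", can be sketched as follows. The 10-second default comes from this thread; the function name and the way the tool implements the fallback internally are assumptions.

```python
from typing import Optional

DEFAULT_TIMEOUT_SECONDS = 10  # default for geofeed file downloads, per this thread


def effective_timeout(requested: Optional[float]) -> float:
    """Treat 0 or a missing value as 'use the default', not 'wait forever'."""
    if not requested:
        return DEFAULT_TIMEOUT_SECONDS
    return requested


print(effective_timeout(0))   # falls back to the default
print(effective_timeout(30))  # an explicit value is kept
```

Under this reading, passing `--download-timeout 0` gives the same behaviour as not passing the option at all, and it has no effect on the whois downloads.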

Let me know if this clarifies the topic. If you find any data that can point to any issue, please share.