SichangHe closed 1 year ago
Interestingly, most of the DBs are encoded in either UTF-8 or ASCII, but some use a Latin-1 variant, and APNIC, being Asian, uses GBK.
I have implemented encoding detection so that all decoding should be correct.
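For illustration only, here is a minimal Python sketch of one way to pick an encoding per file: try strict decoders in order and keep the first that succeeds. This is not necessarily how the detection here is implemented; the function name `decode_with_fallback` and the candidate order are assumptions made for this example.

```python
def decode_with_fallback(raw, candidates=("utf-8", "gbk", "latin-1")):
    """Return (text, encoding) using the first candidate that decodes cleanly.

    Hypothetical helper for illustration. ASCII is a subset of UTF-8, so it
    needs no separate entry. Latin-1 maps every byte to a character, so it
    always succeeds and has to come last.
    """
    for encoding in candidates:
        try:
            # Strict decoding raises on any byte sequence invalid for this codec.
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("no candidate encoding could decode the input")


if __name__ == "__main__":
    samples = [b"route: 192.0.2.0/24", "中文".encode("gbk"), b"caf\xe9"]
    for sample in samples:
        text, encoding = decode_with_fallback(sample)
        print(f"{encoding}: {text}")
```

Some byte sequences are valid in more than one of these encodings (many Latin-1 byte pairs also decode as GBK), so a fallback chain like this can guess wrong; a statistical detector or a per-database encoding table would be more robust.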
Updated log, now that the correct encoding is always used: parse_all_log.txt
After tweaking and fixing many errors, we have this log from parsing all IRR data at the lowest verbosity: parse_all_log.txt
The input is 6.9G, the output 332M.