Open nickrsan opened 7 years ago
I have a giant WARC file, but wpull crashes when rewriting the URLs in the mirrored copy. Is there a way to extract this WARC into browsable files with corrected links (like wget's --convert-links)?
Here it is; please be gentle with my server: https://mirrors.asun.co/climate-mirror/data.globalchange.gov-broken/data.globalchange.warc.gz
Name: Global Change Information System
Organization: USGCRP
Description URL: http://data.globalchange.gov
Download URL:
File Types: JSON
Size:
Status: CLAIMED 2016-12-15
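
To make "corrected links" concrete, here is a rough sketch of the kind of rewriting I mean, roughly in the spirit of wget's --convert-links. It is only an illustration: the helper names (`local_path`, `convert_links`), the `host/path` directory layout, and the example URLs are all made up for this post, query strings are ignored, and it does not read the WARC itself (that would need a WARC reader on top, which is not shown).

```python
import posixpath
import re
from urllib.parse import urldefrag, urljoin, urlsplit


def local_path(url):
    """Map an absolute URL to a relative file path (assumed layout: host/path)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if path.endswith("/"):
        path += "index.html"  # assumed default name for directory URLs
    return parts.netloc + path


def convert_links(html, page_url, mirrored):
    """Rewrite href/src values that point at mirrored URLs to relative paths.

    `mirrored` is the set of absolute URLs that were saved to disk.
    Links to anything else are left untouched.
    """
    page_dir = posixpath.dirname(local_path(page_url))

    def repl(match):
        attr, quote, link = match.group(1), match.group(2), match.group(3)
        # Resolve the link against the page URL and drop any #fragment.
        target, _fragment = urldefrag(urljoin(page_url, link))
        if target not in mirrored:
            return match.group(0)
        rel = posixpath.relpath(local_path(target), page_dir)
        return f"{attr}={quote}{rel}{quote}"

    # Simple regex match on quoted href/src attributes; not a full HTML parser.
    return re.sub(r'(href|src)=(["\'])(.*?)\2', repl, html)


if __name__ == "__main__":
    page = "http://example.org/a/page.html"
    saved = {page, "http://example.org/b/"}
    print(convert_links('<a href="/b/">B</a>', page, saved))
```

The idea is that every saved page gets passed through something like `convert_links` once the full set of extracted URLs is known, so links to pages that were not captured keep pointing at the live site.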