Open parismic opened 2 years ago
It's a pretty straightforward thing to look at if you send us a link to an actual WARC that has this problem.
I tried to open a few WARCs using WebRecorder Player and got this exact error message. I don't know if they were created via ArchiveSpark, but maybe they can be useful for solving the problem. They can be found here.
warcio raises
warcio.exceptions.ArchiveLoadFailed: Invalid WARC record, first line: WARC-Type: response
at the second WARC record in a WARC file written with ArchiveSpark. Both state that they use the ISO standard: http://bibnum.bnf.fr/WARC/WARC_ISO_28500_version1_latestdraft.pdf
warcio also returns a warning before the error:
warcio.statusandheaders.StatusAndHeadersParserException: Expected Status Line starting with ['WARC/1.1', 'WARC/1.0', 'WARC/0.17', 'WARC/0.18'] - Found: WARC-Type: response
It could be that ArchiveSpark should write an additional empty line between the records or warcio is not in line with the ISO.
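To narrow down which side is off, one option is to check the record framing directly. The following is only a minimal sketch, not warcio's parser: it assumes an uncompressed WARC held in memory and relies on the ISO 28500 layout, where each record is a version line, header fields, an empty line, a block of Content-Length bytes, and then two CRLFs. The names check_warc_records and make_record are hypothetical helpers made up for this illustration.

```python
def check_warc_records(data: bytes) -> list:
    """Walk an uncompressed WARC byte string and report the first framing problem.

    Returns a list of messages; an empty list means every record started with
    a version line and was followed by two CRLFs.
    """
    problems = []
    pos = 0
    index = 0
    while pos < len(data):
        index += 1
        # The header section ends at the first empty line (CRLF CRLF).
        header_end = data.find(b"\r\n\r\n", pos)
        if header_end == -1:
            problems.append(f"record {index}: header is not terminated by an empty line")
            break
        lines = data[pos:header_end].split(b"\r\n")
        # The first line of every record must be the version, e.g. WARC/1.0.
        if not lines[0].startswith(b"WARC/"):
            problems.append(
                f"record {index}: expected a version line, found {lines[0][:60]!r}"
            )
            break
        # Content-Length tells us how many bytes the block occupies.
        length = None
        for line in lines[1:]:
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length" and value.strip().isdigit():
                length = int(value.strip())
        if length is None:
            problems.append(f"record {index}: missing or invalid Content-Length")
            break
        block_end = header_end + 4 + length
        # The block must be followed by exactly the two-CRLF record separator.
        if data[block_end:block_end + 4] != b"\r\n\r\n":
            problems.append(f"record {index}: block is not followed by two CRLFs")
            break
        pos = block_end + 4
    return problems


def make_record(body: bytes, version_line: bool = True, trailer: bytes = b"\r\n\r\n") -> bytes:
    """Build a tiny synthetic record (hypothetical helper, for demonstration only)."""
    head = [b"WARC/1.0"] if version_line else []
    head += [b"WARC-Type: response", b"Content-Length: " + str(len(body)).encode()]
    return b"\r\n".join(head) + b"\r\n\r\n" + body + trailer


if __name__ == "__main__":
    ok = make_record(b"hello") + make_record(b"world")
    print(check_warc_records(ok))
    # Second record lacks its version line, so its first line is WARC-Type.
    broken = make_record(b"hello") + make_record(b"world", version_line=False)
    print(check_warc_records(broken))
```

If the check reports a first line of WARC-Type: response for the second record, as in the error above, the file is either missing the version line or a record boundary is misaligned relative to Content-Length. That would point at the writer rather than the reader, but the check needs to be run on a real file to say for sure.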
I'll post this issue on ArchiveSpark as well. Does anyone know more?