wget quirk: Content-Length off by one

Some versions of wget generate WARC headers with an off-by-one Content-Length. This causes us to throw:

org.netpreserve.jwarc.ParsingException: invalid WARC trailer: a0d0a57

Examples:

http_message_1.warc.gz Wget/1.19.4 (from #25)
http_chunked_2.warc.gz Wget/1.19.4 (from #24)

Other implementations appear to ignore this error, perhaps by simply skipping any number of CR and LF characters before reading the next record.

I don't want to silently ignore this, but perhaps we could log a warning and attempt to continue.
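
A rough sketch of what "log a warning and continue" might look like. This is not jwarc's actual parser code: `LenientTrailer` and `skipTrailer` are hypothetical names, and it works on a plain `ByteBuffer` positioned just after the record body. If the hex in the exception is read as the bytes 0a 0d 0a 57, that is LF CR LF followed by the `W` of the next record's `WARC/` line, which would fit a Content-Length that is one too large and has swallowed the first CR of the trailer. The sketch skips whatever run of CR and LF bytes it finds and warns when that run is not exactly CR LF CR LF.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.logging.Logger;

// Hypothetical helper, not part of jwarc.
final class LenientTrailer {
    private static final Logger LOG = Logger.getLogger(LenientTrailer.class.getName());

    /**
     * Consumes the run of CR/LF bytes at the buffer's current position, leaving
     * the buffer at the first byte that is neither CR nor LF (or at the limit).
     * Returns true if the run was exactly CR LF CR LF; otherwise logs a warning
     * and returns false so the caller can still carry on.
     */
    static boolean skipTrailer(ByteBuffer buf) {
        int start = buf.position();
        while (buf.hasRemaining()) {
            byte b = buf.get(buf.position()); // absolute get: peek without consuming
            if (b != '\r' && b != '\n') break;
            buf.position(buf.position() + 1);
        }
        int len = buf.position() - start;
        boolean exact = len == 4
                && buf.get(start) == '\r' && buf.get(start + 1) == '\n'
                && buf.get(start + 2) == '\r' && buf.get(start + 3) == '\n';
        if (!exact) {
            LOG.warning("invalid WARC trailer (" + len + " CR/LF bytes), attempting to continue");
        }
        return exact;
    }

    public static void main(String[] args) {
        // Trailer as seen after an over-long Content-Length: the first CR is missing.
        byte[] data = "\n\r\nWARC/1.0\r\n".getBytes(StandardCharsets.US_ASCII);
        ByteBuffer buf = ByteBuffer.wrap(data);
        boolean exact = skipTrailer(buf);
        System.out.println("exact=" + exact + " nextByte=" + (char) buf.get(buf.position()));
    }
}
```

This only covers the case where the declared length is too large. If a Content-Length were too small, the leftover body byte would not be CR or LF and the next record would still fail to parse, which is probably the right outcome.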
