This allows HTTP parsing of the three examples in #25. Note that although the HTTP message itself now parses successfully, the http_message_1.warc.gz example throws an invalid WARC trailer exception, because the value of the Content-Length WARC header seems to be one byte larger than the actual content block.
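To illustrate why an off-by-one Content-Length surfaces as a trailer error: a WARC record is the header, a blank line, the content block, and then two CRLFs. A reader that skips exactly Content-Length bytes expects to land on that CRLF CRLF trailer. The following is only a minimal Python sketch of that check, not the parser's actual code; `check_trailer` and `make_record` are hypothetical helpers and the sample record is made up.

```python
CRLF = b"\r\n"
TRAILER = CRLF + CRLF  # a WARC record ends with two CRLFs after the content block


def check_trailer(record: bytes) -> bool:
    """Return True if the block declared by Content-Length is followed by CRLF CRLF."""
    # The WARC header ends at the first blank line.
    header, sep, rest = record.partition(TRAILER)
    if not sep:
        raise ValueError("no blank line terminating the WARC header")
    length = None
    for line in header.split(CRLF)[1:]:  # skip the version line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    # Skip exactly `length` bytes of block, then expect the trailer.
    return rest[length:length + len(TRAILER)] == TRAILER


def make_record(block: bytes, declared_length: int) -> bytes:
    """Build a toy record (hypothetical sample data) with a chosen Content-Length."""
    header = CRLF.join([
        b"WARC/1.0",
        b"WARC-Type: resource",
        b"Content-Length: " + str(declared_length).encode("ascii"),
    ])
    return header + TRAILER + block + TRAILER


block = b"HTTP/1.1 200 OK\r\n\r\nhello"
print(check_trailer(make_record(block, len(block))))      # correct length
print(check_trailer(make_record(block, len(block) + 1)))  # one byte too large
```

With the correct length the reader lands on the trailer. With a length one byte too large it starts one byte into the trailer, so the check fails even though the HTTP message inside the block is well-formed.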
We may need to add additional leniency over time if more examples are found. This isn't intended to allow every random byte sequence to be interpreted as an HTTP message, but rather to cope with real non-standard messages that were used in the wild and that would have worked in browsers at the time.
Thanks, @ato! I can confirm that the HTTP headers in all 3 examples are now parsed successfully. Yes, the Content-Length in one example is wrong (one superfluous line break).
Closes #25. CC @sebastian-nagel.