iipc / jwarc

Java library for reading and writing WARC files with a typed API
Apache License 2.0

Payload body has size 0 if HTTP Content-Length header is missing #36

Closed: sebastian-nagel closed this issue 4 years ago

sebastian-nagel commented 4 years ago

If the HTTP Content-Length header is missing, a LengthedBody with size 0 is created as the payload body. This is reproducible when parsing the response record in http_no_content_length_1.warc.gz.

Reason: LengthedBody.discardPushbackOnRead() does not return an instance of LengthedBody. It returns an instance of an anonymous class declared inside LengthedBody (e.g. LengthedBody$1), so the instanceof LengthedBody check fails.
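
A minimal sketch of the pitfall, not jwarc's actual code: Body, SizedBody, wrapped() and isSized() are hypothetical stand-ins. It assumes the anonymous class extends a common supertype rather than the enclosing class itself, which is what the failing check suggests.

```java
public class InstanceofDemo {
    // Hypothetical stand-in for a generic message body type.
    static abstract class Body {
        abstract long size();
    }

    // Hypothetical stand-in for a length-delimited body.
    static class SizedBody extends Body {
        private final long size;

        SizedBody(long size) {
            this.size = size;
        }

        @Override
        long size() {
            return size;
        }

        // Returns an anonymous subclass of Body that is declared inside SizedBody.
        // Its binary name ends in SizedBody$1, but it is NOT a SizedBody.
        Body wrapped() {
            return new Body() {
                @Override
                long size() {
                    return SizedBody.this.size;
                }
            };
        }
    }

    // The kind of type check that silently fails for the wrapper.
    static boolean isSized(Body body) {
        return body instanceof SizedBody;
    }

    public static void main(String[] args) {
        SizedBody original = new SizedBody(42);
        Body wrapper = original.wrapped();

        System.out.println(wrapper.getClass().getName()); // InstanceofDemo$SizedBody$1
        System.out.println(isSized(original));            // true
        System.out.println(isSized(wrapper));             // false
    }
}
```

Being declared inside a class does not make an anonymous class a subtype of it. Code that branches on instanceof therefore takes the fallback path for the wrapper, which in the reported case presumably produces the body of size 0.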