iipc / jwarc

Java library for reading and writing WARC files with a typed API
Apache License 2.0

Gzip compression #53

Closed alex73 closed 3 years ago

alex73 commented 3 years ago

Does jwarc support appending to a gzip-compressed WARC? I wrote a new WARC with gzip compression, appended to it, and then tried to read it back. Reading fails with:

Caused by: java.util.zip.ZipException: not in gzip format (magic=4157)
    at org.netpreserve.jwarc.GunzipChannel.readHeader(GunzipChannel.java:109)
    at org.netpreserve.jwarc.GunzipChannel.read(GunzipChannel.java:45)
    at org.netpreserve.jwarc.WarcParser.parse(WarcParser.java:306)
    at org.netpreserve.jwarc.WarcReader.next(WarcReader.java:151)
    at org.netpreserve.jwarc.WarcReader$1.hasNext(WarcReader.java:241)

Shouldn't each record be gzipped separately?
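
For context, here is a minimal sketch of why per-record gzip makes appending possible. It uses only `java.util.zip`, not the jwarc API. Each chunk is compressed as its own gzip member, and concatenated members still form a valid gzip stream. The class name `GzipMembersSketch` and its helpers are hypothetical, made up for illustration. The `littleEndianMagic` helper rests on an assumption the thread does not confirm: that the `magic=4157` value in the trace is hexadecimal and is read as a little-endian 16-bit value, as with the gzip magic `0x8b1f`. If so, it matches the bytes `W`, `A`, which would suggest uncompressed `WARC/...` data where a gzip header was expected.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; not part of jwarc.
class GzipMembersSketch {
    // Compress one chunk as its own complete gzip member (header, data, trailer).
    static byte[] gzipMember(String text) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buf.toByteArray();
    }

    // "Append" by plain byte concatenation, as appending to a file would do.
    static byte[] concat(byte[] a, byte[] b) {
        byte[] out = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    // Decompress every member in the stream into a single string.
    static String gunzipAll(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Interpret two leading bytes as a little-endian 16-bit value
    // (assumed to be how the magic number in the trace was formed).
    static int littleEndianMagic(int firstByte, int secondByte) {
        return ((secondByte & 0xff) << 8) | (firstByte & 0xff);
    }

    public static void main(String[] args) throws IOException {
        byte[] original = gzipMember("record one\n");
        byte[] appended = gzipMember("record two\n");
        System.out.print(gunzipAll(concat(original, appended)));
        System.out.println(Integer.toHexString(littleEndianMagic('W', 'A')));
    }
}
```

Under those assumptions, an append only works if the appended data is itself written through a gzip-compressing writer. Mixing compressed and uncompressed records in one file would produce the kind of header error shown above.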

alex73 commented 3 years ago

Sorry, my mistake