chfoo / warcat

Tool and library for handling Web ARChive (WARC) files.
GNU General Public License v3.0

Support warnings when Content-Type doesn't match what cdx-writer expects #4

Closed: chfoo closed this issue 10 years ago

chfoo commented 10 years ago

Currently, cdx-writer expects the Content-Type header to be in the form application/http; msgtype=response. However, the WARC spec also allows the form application/http;msgtype=response (note that there is no space after the semicolon). Warcat should warn when it detects the form without the space.
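As a rough illustration of the requested check, here is a minimal Python sketch. The names has_compact_params and check_content_type are hypothetical and are not part of warcat's actual API; the sketch also assumes a simple header value with no quoted strings containing semicolons.

```python
import re
import warnings

# Hypothetical helpers for illustration only; not warcat's real API.
# Matches a ';' that is immediately followed by a non-whitespace character,
# i.e. a parameter written with no space after the semicolon.
_NO_SPACE_AFTER_SEMICOLON = re.compile(r";(?=\S)")


def has_compact_params(content_type: str) -> bool:
    """Return True if a parameter follows ';' with no whitespace in between."""
    return bool(_NO_SPACE_AFTER_SEMICOLON.search(content_type))


def check_content_type(content_type: str) -> bool:
    """Warn about the no-space form; return True if a warning was issued."""
    if has_compact_params(content_type):
        warnings.warn(
            "Content-Type %r has no space after ';'; cdx-writer may expect "
            "the form 'application/http; msgtype=response'" % content_type
        )
        return True
    return False
```

Calling check_content_type("application/http;msgtype=response") issues a warning and returns True, while the spaced form application/http; msgtype=response passes silently and returns False. A warning rather than an error fits here because the WARC spec allows both forms.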

See internetarchive/CDX-Writer#4.

chfoo commented 10 years ago

This was fixed in CDX-Writer, so this issue is no longer relevant.