Open JustAnotherArchivist opened 5 years ago
Not sure what would be a better option here. It is a fallback if no other Content-Type is specified and/or it's a non-standard record. application/warc-fields is for the warcinfo-style fields, which this is not, and application/warc makes sense as the content type for the WARC itself, but not for the payload of the record. I suppose it could be application/octet-stream, but that would imply that it's binary.
The Content-Type header is optional, so omitting it would be one option. application/octet-stream also seems sensible to me. WARC is a byte-oriented file format, so any payload must also be a collection of bytes. While the underlying data could be bit-based, it must be padded to bytes, which makes the container an octet-stream again. The WARC specification also mentions:
If the media type remains unknown, the reader should treat it as type “application/octet-stream”.
Personally, I think omitting the header would be the best option.
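To illustrate what omitting the header would mean for a reader, here is a minimal sketch of the fallback the spec quote describes. It is not warcio code: the helper name effective_content_type and the representation of headers as a list of (name, value) pairs are assumptions made for the example.

```python
# Hypothetical helper; not part of warcio or the WARC specification.
# Assumes record headers are available as a list of (name, value) pairs.
DEFAULT_MEDIA_TYPE = "application/octet-stream"


def effective_content_type(warc_headers):
    """Return the record's Content-Type, or the fallback if it is absent."""
    for name, value in warc_headers:
        # Header field names are compared case-insensitively.
        if name.lower() == "content-type" and value.strip():
            return value.strip()
    # No usable Content-Type: treat the media type as unknown.
    return DEFAULT_MEDIA_TYPE


# A record written without a Content-Type header.
headers_without_type = [("WARC-Type", "resource"), ("Content-Length", "42")]

# A warcinfo record that declares its type explicitly.
headers_with_type = [
    ("WARC-Type", "warcinfo"),
    ("Content-Type", "application/warc-fields"),
]

print(effective_content_type(headers_without_type))
print(effective_content_type(headers_with_type))
```

Under this reading, a writer that omits the header and a writer that sets application/octet-stream explicitly end up looking the same to a spec-following reader.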
warcio uses a default Content-Type value for WARC records of application/warc-record. This MIME type is not documented or specified anywhere; the WARC spec only mentions application/warc as the MIME type for WARC files and application/warc-fields for warcinfo and metadata records (though it is ambiguous on whether that is required or recommended).