Open dd8 opened 6 years ago
When I try these signatures in browsers they end up downloading the resource, while not identifying the type (as far as I can tell). This to me suggests these are treated as "binary", application/octet-stream, and this part of the standard isn't implemented.
It was included as part of https://tools.ietf.org/html/draft-ietf-websec-mime-sniff-03 (with the same error), so maybe my analysis is wrong somehow though.
Nothing in https://searchfox.org/mozilla-central/source/netwerk/streamconv/converters/nsUnknownDecoder.cpp or the variables used there suggests to me that Firefox implements this. Perhaps it should, though.
I do think this is probably worth fixing, including adding this in browsers. It'll result in more accurate downloads and will allow us to block more things in CORB easily (though that might require some restructuring).
Chromium has been using the string "Rar!\x1A\x07\x00" since its initial commit in 2008, so I think this may have been a mistake in converting to bytes for the original draft of the spec:
https://chromium.googlesource.com/chromium/src.git/+/refs/heads/main/net/base/mime_sniffer.cc
Firefox does not even define a constant for the RAR MIME type, let alone handle RAR files:
https://hg.mozilla.org/mozilla-central/file/tip/netwerk/mime/nsMimeTypes.h
https://hg.mozilla.org/mozilla-central/file/tip/uriloader/exthandler/nsExternalHelperAppService.cpp
https://hg.mozilla.org/mozilla-central/file/tip/toolkit/components/reputationservice/ApplicationReputation.cpp
The 4th byte of the RAR signature is wrong - it should be 21 and not 20.
Additionally, the last byte is not always zero - it's a version number 00 (RAR 1.5 to 4.0) or 01 (RAR 5) with values 02 through 05 reserved for future versions.
The pattern for RAR files in the spec is:
52 61 72 20 1A 07 00
The patterns we've seen from a random selection of downloaded RAR files are:
52 61 72 21 1A 07 00 cf 90 73 00 00 0d 00 00 00
52 61 72 21 1A 07 00 3b d0 73 08 00 0d 00 00 00
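To make the mismatch concrete, here is a minimal Python sketch that checks the two observed headers against the pattern from the spec and against the same pattern with the 4th byte changed to 21. The names (SPEC_PATTERN, CORRECTED_PATTERN, OBSERVED_HEADERS, matches) are made up for illustration and are not taken from the spec or from any browser.

```python
# Pattern as currently written in the spec (4th byte is 0x20, a space).
SPEC_PATTERN = bytes.fromhex("52 61 72 20 1A 07 00")
# Same pattern with the 4th byte changed to 0x21 ("!").
CORRECTED_PATTERN = bytes.fromhex("52 61 72 21 1A 07 00")

# The first 16 bytes of the two downloaded RAR files quoted above.
OBSERVED_HEADERS = [
    bytes.fromhex("52 61 72 21 1A 07 00 cf 90 73 00 00 0d 00 00 00"),
    bytes.fromhex("52 61 72 21 1A 07 00 3b d0 73 08 00 0d 00 00 00"),
]


def matches(header: bytes, pattern: bytes) -> bool:
    """Return True if the header starts with the given byte pattern."""
    return header.startswith(pattern)


spec_hits = [matches(h, SPEC_PATTERN) for h in OBSERVED_HEADERS]
corrected_hits = [matches(h, CORRECTED_PATTERN) for h in OBSERVED_HEADERS]

print("spec pattern:     ", spec_hits)
print("corrected pattern:", corrected_hits)
```

With these inputs the spec pattern matches neither file, while the corrected pattern matches both.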
Wikipedia and RARlabs show the signatures as
52 61 72 21 1A 07 00 (RAR 1.5 to 4.0)
52 61 72 21 1A 07 01 00 (RAR 5+)
https://www.rarlab.com/technote.htm#rarsign https://en.wikipedia.org/wiki/RAR_(file_format)
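As a rough illustration of what a sniffer covering both signatures could look like, here is a small Python sketch. The function name sniff_rar_version and the returned labels are hypothetical. This is not how Chromium, Firefox or UnRAR implement the check, only the two byte patterns listed above turned into code.

```python
from typing import Optional

# Signatures as documented by RARLAB and Wikipedia.
RAR4_SIGNATURE = b"Rar!\x1a\x07\x00"      # RAR 1.5 to 4.0 (7 bytes)
RAR5_SIGNATURE = b"Rar!\x1a\x07\x01\x00"  # RAR 5+ (8 bytes)


def sniff_rar_version(data: bytes) -> Optional[str]:
    """Return a label for the RAR format that the data starts with, or None."""
    # Check the longer RAR 5 signature first. The two signatures differ at
    # the 7th byte (00 vs 01), so the order does not change the result here.
    if data.startswith(RAR5_SIGNATURE):
        return "RAR 5+"
    if data.startswith(RAR4_SIGNATURE):
        return "RAR 1.5 to 4.0"
    return None


# One of the observed headers from above, a synthetic RAR 5 header,
# and the pattern as currently written in the spec.
print(sniff_rar_version(bytes.fromhex("52 61 72 21 1A 07 00 cf 90 73 00 00 0d")))
print(sniff_rar_version(bytes.fromhex("52 61 72 21 1A 07 01 00")))
print(sniff_rar_version(bytes.fromhex("52 61 72 20 1A 07 00")))
```

A sniffer that only needs to know "is this RAR?" could instead match the first six bytes and treat the 7th as a version byte, but whether unknown version values should be accepted is a policy decision for the spec.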
Here's a mirror of the UnRAR source from rarlab.com on GitHub. The IsSignature method shows how the last byte in the signature is used as a version number: https://github.com/pmachapman/unrar/blob/master/archive.cpp#LC99
(original source code is at https://www.rarlab.com/rar_add.htm)
Note: there's an older RAR 1.4 format (RARFMT14) with a very different signature, but that was replaced by RAR 1.5 in the 1990s, so RAR 1.4 is unlikely to be seen in the wild.