whatwg / mimesniff

MIME Sniffing Standard
https://mimesniff.spec.whatwg.org/

Signature for RAR is wrong #63

Open dd8 opened 6 years ago

dd8 commented 6 years ago

The 4th byte of the RAR signature is wrong - it should be 21 and not 20.

Additionally, the last byte is not always zero. It's a version number: 00 (RAR 1.5 to 4.0) or 01 (RAR 5), with values 02 through 05 reserved for future versions.

The pattern for RAR files in the spec is:

52 61 72 20 1A 07 00

The patterns we've seen from a random selection of downloaded RAR files are:

52 61 72 21 1A 07 00 cf 90 73 00 00 0d 00 00 00
52 61 72 21 1A 07 00 3b d0 73 08 00 0d 00 00 00
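To make the mismatch concrete, here is a minimal sketch that compares the two sample headers above against the pattern currently in the spec and against the proposed correction. It uses a plain prefix comparison, which assumes every byte of the pattern is significant; the names `SPEC_PATTERN`, `CORRECTED_PATTERN`, and `matches` are made up for illustration.

```python
# Pattern as currently written in the spec (4th byte is 0x20).
SPEC_PATTERN = bytes.fromhex("52 61 72 20 1A 07 00")
# Proposed correction (4th byte is 0x21).
CORRECTED_PATTERN = bytes.fromhex("52 61 72 21 1A 07 00")

# The two headers observed in downloaded RAR files.
OBSERVED_HEADERS = [
    bytes.fromhex("52 61 72 21 1A 07 00 cf 90 73 00 00 0d 00 00 00"),
    bytes.fromhex("52 61 72 21 1A 07 00 3b d0 73 08 00 0d 00 00 00"),
]


def matches(header: bytes, pattern: bytes) -> bool:
    """Hypothetical helper: True if the header begins with the pattern."""
    return header.startswith(pattern)


spec_hits = [matches(h, SPEC_PATTERN) for h in OBSERVED_HEADERS]
corrected_hits = [matches(h, CORRECTED_PATTERN) for h in OBSERVED_HEADERS]

print("spec pattern:     ", spec_hits)
print("corrected pattern:", corrected_hits)
```

Neither sample matches the spec pattern, while both match the corrected one.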

Wikipedia and RARlabs show the signatures as

52 61 72 21 1A 07 00 (RAR 1.5 to 4.0)
52 61 72 21 1A 07 01 00 (RAR 5+)

https://www.rarlab.com/technote.htm#rarsign
https://en.wikipedia.org/wiki/RAR_(file_format)
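As a rough sketch of how a sniffer could tell the two signatures apart, the function below checks the shared six-byte prefix and then inspects the version byte. This is not the UnRAR logic and not spec text; `rar_signature_version` and its return labels are hypothetical, and reserved version values are simply treated as unrecognized.

```python
from typing import Optional

# First six bytes shared by both signatures: "Rar!" 1A 07.
RAR_PREFIX = bytes.fromhex("52 61 72 21 1A 07")


def rar_signature_version(header: bytes) -> Optional[str]:
    """Hypothetical helper: classify a header by its RAR signature.

    Returns a label for the two signatures listed above,
    or None if the header matches neither.
    """
    if len(header) < 7 or not header.startswith(RAR_PREFIX):
        return None
    version = header[6]
    if version == 0x00:
        return "RAR 1.5 to 4.0"
    # The RAR 5 signature is eight bytes long: version 01 followed by 00.
    if version == 0x01 and header[7:8] == b"\x00":
        return "RAR 5+"
    # Other values (for example the reserved 02 through 05) are not recognized here.
    return None


print(rar_signature_version(bytes.fromhex("52 61 72 21 1A 07 00 cf 90 73")))
print(rar_signature_version(bytes.fromhex("52 61 72 21 1A 07 01 00")))
print(rar_signature_version(bytes.fromhex("52 61 72 20 1A 07 00")))
```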

Here's a mirror of the UnRAR source from rarlab.com on GitHub. The IsSignature method shows how the last byte in the signature is used as a version number: https://github.com/pmachapman/unrar/blob/master/archive.cpp#LC99

(original source code is at https://www.rarlab.com/rar_add.htm)

Note: there's an older RAR 1.4 format (RARFMT14) with a very different signature, but that was replaced by RAR 1.5 in the 1990s, so RAR 1.4 is unlikely to be seen in the wild.

annevk commented 6 years ago

When I try these signatures in browsers they end up downloading the resource, while not identifying the type (as far as I can tell). This suggests to me that they are treated as "binary" (application/octet-stream) and that this part of the standard isn't implemented.

It was included as part of https://tools.ietf.org/html/draft-ietf-websec-mime-sniff-03 (with the same error), so my analysis may be wrong somehow.

annevk commented 6 years ago

Nothing in https://searchfox.org/mozilla-central/source/netwerk/streamconv/converters/nsUnknownDecoder.cpp or the variables used there leads me to believe that Firefox implements this. Perhaps it should though.

annevk commented 6 years ago

I do think this is probably worth fixing, including adding this in browsers. It'll result in more accurate downloads and will allow us to block more things in CORB easily (though that might require some restructuring).

GPHemsley commented 3 years ago

Chromium has been using the string "Rar!\x1A\x07\x00" since its initial commit in 2008, so I think this may have been a mistake in converting to bytes for the original draft of the spec: https://chromium.googlesource.com/chromium/src.git/+/refs/heads/main/net/base/mime_sniffer.cc
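A quick way to see the difference is to convert the Chromium string to hex and decode the spec's bytes back to text. This is only an illustrative sketch: the variable names are made up, and it does not show how the original draft was actually produced.

```python
# The magic string quoted from Chromium's mime_sniffer.cc.
chromium_magic = b"Rar!\x1A\x07\x00"
chromium_hex = chromium_magic.hex(" ").upper()

# The byte pattern currently in the spec, decoded back to text.
spec_pattern = bytes.fromhex("52 61 72 20 1A 07 00")
spec_text = spec_pattern.decode("ascii")

print(chromium_hex)      # hex form of the Chromium string
print(repr(spec_text))   # the spec bytes spell "Rar " with a space, not "Rar!"

# '!' is 0x21 and a space is 0x20, so the two patterns differ only in the 4th byte.
differing = [i for i, (a, b) in enumerate(zip(chromium_magic, spec_pattern)) if a != b]
print(differing)
```

The only differing position is index 3, where the Chromium string has `!` (0x21) and the spec has a space (0x20), which is consistent with a slip when the string was written out as bytes.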

Firefox does not even define a constant for the RAR MIME type, let alone handle RAR files:

https://hg.mozilla.org/mozilla-central/file/tip/netwerk/mime/nsMimeTypes.h
https://hg.mozilla.org/mozilla-central/file/tip/uriloader/exthandler/nsExternalHelperAppService.cpp
https://hg.mozilla.org/mozilla-central/file/tip/toolkit/components/reputationservice/ApplicationReputation.cpp