Open seanmonstar opened 5 years ago
cc @nox @SimonSapin @rustonaut
https://mimesniff.spec.whatwg.org/ is called "MIME Sniffing" and contains a "parse a MIME type" algorithm that is relevant.
But "sniffing" refers to looking at the contents of a file or the body of an HTTP response (in addition to other signals) to make a guess at the actual file format, in case the Content-Type header is missing, unspecific, or inaccurate. For example, if the first 6 bytes of a file are GIF89a in ASCII, it's very probably a GIF, especially if it's used in <img>. That spec also has algorithms for this.
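To make the distinction concrete, here is a minimal sketch of what body sniffing looks like for the GIF example above. The function name `sniff_gif` is hypothetical and not part of any crate discussed here. Besides GIF89a it also checks the older GIF87a signature, which is an addition beyond the example in this comment.

```rust
/// Hypothetical helper: guess "image/gif" from the leading bytes of a body.
/// This is content sniffing, not MIME type parsing.
fn sniff_gif(body: &[u8]) -> Option<&'static str> {
    // A GIF file starts with one of two 6-byte ASCII signatures.
    if body.starts_with(b"GIF89a") || body.starts_with(b"GIF87a") {
        Some("image/gif")
    } else {
        None
    }
}
```

A real sniffer would combine this with other signals, such as the declared Content-Type and the context the resource is used in.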
This kind of sniffing can be useful, but I don’t know if it should be in scope for this crate.
Sorry, I don't mean sniffing the body bytes, just using the parse algorithm mentioned in that document.
So, looking through the test cases, I noticed this as a valid MIME type:
!#$%&'*+-.^_`|~0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz/!#$%&'*+-.^_`|~0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz;!#$%&'*+-.^_`|~0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz=!#$%&'*+-.^_`|~0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
Something I appreciate in the API in mime/master is the difference between MediaType and MediaRange. They allow things like text/* to be a MediaRange, but not a MediaType. That combined with headers::ContentType would help prevent setting a frankly bogus content-type header (even though mimesniff says to parse it).
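A rough sketch of the idea behind that split, for readers who have not seen it. The types and methods below are illustrative stand-ins, not the actual API in mime/master, and they ignore parameters and token validation entirely.

```rust
/// Illustrative only: a range may contain "*" wildcards.
#[derive(Debug, PartialEq)]
struct MediaRange {
    type_: String,
    subtype: String,
}

/// Illustrative only: a concrete type, guaranteed to contain no wildcard.
#[derive(Debug, PartialEq)]
struct MediaType(MediaRange);

impl MediaRange {
    fn parse(s: &str) -> Option<MediaRange> {
        let (t, sub) = s.split_once('/')?;
        // Reject empty parts and the meaningless "*/subtype" form.
        if t.is_empty() || sub.is_empty() || (t == "*" && sub != "*") {
            return None;
        }
        Some(MediaRange {
            type_: t.to_ascii_lowercase(),
            subtype: sub.to_ascii_lowercase(),
        })
    }

    /// True if the concrete type falls inside this range.
    fn matches(&self, mt: &MediaType) -> bool {
        (self.type_ == "*" || self.type_ == mt.0.type_)
            && (self.subtype == "*" || self.subtype == mt.0.subtype)
    }
}

impl MediaType {
    fn parse(s: &str) -> Option<MediaType> {
        let range = MediaRange::parse(s)?;
        // A concrete media type must not contain wildcards.
        if range.type_ == "*" || range.subtype == "*" {
            return None;
        }
        Some(MediaType(range))
    }
}
```

With types like these, a Content-Type setter that only accepts a `MediaType` cannot be handed text/*, while an Accept-style API can still take a `MediaRange`.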
So I'm torn.
After some more thought, the advantages of just following what the Fetch spec wants outweigh having the MediaType and MediaRange split.
So, the new plan is to remove the split, only having Mime again, and only supporting the mimesniff parsing algorithm.
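For reference, a simplified sketch of what a mimesniff-style parser does. It is written from my reading of the spec's "parse a MIME type" steps and is not the crate's implementation. The `Mime` struct and function names here are hypothetical, and quoted-string parameter values are deliberately not handled.

```rust
/// HTTP token code points: ASCII alphanumerics plus the punctuation
/// seen in the test case quoted earlier in this thread.
fn is_http_token_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b"!#$%&'*+-.^_`|~".contains(&b)
}

fn is_token(s: &str) -> bool {
    !s.is_empty() && s.bytes().all(is_http_token_byte)
}

/// HTTP whitespace: tab, LF, CR, space.
fn is_http_whitespace(c: char) -> bool {
    matches!(c, '\t' | '\n' | '\r' | ' ')
}

/// Code points allowed in an unquoted parameter value.
fn is_value_char(c: char) -> bool {
    c == '\t' || (' '..='~').contains(&c) || ('\u{80}'..='\u{ff}').contains(&c)
}

/// Hypothetical result type for this sketch.
#[derive(Debug, PartialEq)]
struct Mime {
    type_: String,
    subtype: String,
    params: Vec<(String, String)>,
}

fn parse_mime(input: &str) -> Option<Mime> {
    let input = input.trim_matches(is_http_whitespace);
    // The type is everything before the first '/', which must be present.
    let (type_, rest) = input.split_once('/')?;
    if !is_token(type_) {
        return None;
    }
    // The subtype runs up to the first ';' (or the end of the input).
    let (subtype, params_str) = rest.split_once(';').unwrap_or((rest, ""));
    let subtype = subtype.trim_end_matches(is_http_whitespace);
    if !is_token(subtype) {
        return None;
    }
    // Malformed parameters are skipped rather than failing the whole parse.
    // Simplification: quoted-string values are NOT handled here.
    let mut params: Vec<(String, String)> = Vec::new();
    for piece in params_str.split(';') {
        let piece = piece.trim_start_matches(is_http_whitespace);
        if let Some((name, value)) = piece.split_once('=') {
            let name = name.to_ascii_lowercase();
            let value = value.trim_end_matches(is_http_whitespace);
            let duplicate = params.iter().any(|(n, _)| *n == name);
            if is_token(&name)
                && !value.is_empty()
                && value.chars().all(is_value_char)
                && !duplicate
            {
                params.push((name, value.to_string()));
            }
        }
    }
    Some(Mime {
        type_: type_.to_ascii_lowercase(),
        subtype: subtype.to_ascii_lowercase(),
        params,
    })
}
```

Note how lenient this is compared with a strict RFC grammar: surrounding whitespace is trimmed, bad parameters are dropped silently, and anything made of token characters is accepted as a type or subtype.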
The closer it is to the mimesniff algorithm, the more we can make use of it.
What would be useful too is a way to represent just the essence of a mime type, because many specs have prose about that.
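As an illustration of what "essence" means here: it is the type and subtype joined by a slash, with the parameters dropped. The free function below is a hypothetical sketch working on raw strings, not a proposal for the crate's API, and it skips token validation.

```rust
/// Hypothetical helper: return the lowercase "type/subtype" part of a
/// MIME type string, ignoring any parameters.
fn essence(mime: &str) -> Option<String> {
    let is_ws = |c: char| matches!(c, '\t' | '\n' | '\r' | ' ');
    // Everything before the first ';' is the type/subtype part.
    let head = mime.split(';').next()?;
    let (type_, subtype) = head.split_once('/')?;
    let type_ = type_.trim_matches(is_ws);
    let subtype = subtype.trim_matches(is_ws);
    if type_.is_empty() || subtype.is_empty() {
        return None;
    }
    Some(format!(
        "{}/{}",
        type_.to_ascii_lowercase(),
        subtype.to_ascii_lowercase()
    ))
}
```

Comparing essences lets spec prose such as "if the MIME type's essence is text/html" be implemented without caring about charset or other parameters.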
Hi,
Is there a way to expose both parsers (RFC and mimesniff)? Actually, I'd like to make some Servo tests pass, so I need to follow the mimesniff algorithm. @SimonSapin has already implemented it in rust-url (but it is not officially exposed by the crate). Should I duplicate the code in Servo, or can I help here?
Regards
The target domain of the mime crate is webdev. Instead of following the original RFCs (as is done now), perhaps it's best to just use the parsing algorithm from the MIME Sniffing spec that is now used by web browsers.