Open AidanWelch opened 2 months ago
What's included is based on WHATWG MIME sniffing (https://mimesniff.spec.whatwg.org/). This gives us a clear spec to adhere to, rather than an arbitrary list.
@seankhliao Wow, thanks for the quick response, but I'm confused about where that spec calls for just the MIME types listed in builtinTypes. From my understanding, it would be more relevant to net/http's DetectContentType, which actually sniffs. For mime's ExtensionsByType and TypeByExtension, don't we assume that the file extension/type is truthful and try to determine the most likely counterpart from it, whereas sniffing wouldn't even look at the given type or extension? (And so sniffing would give most, if not all, plaintext types the same extension/type, for example.)
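The distinction drawn above can be sketched in a few lines. This is only an illustration, not code from the issue: the file name `report.csv`, its contents, and the helper names `byExtension` and `bySniffing` are made up, while `mime.TypeByExtension` and `http.DetectContentType` are the standard library functions under discussion.

```go
package main

import (
	"fmt"
	"mime"
	"net/http"
	"path/filepath"
)

// byExtension trusts the file name and looks its extension up in the mime
// package's table (built-in entries plus any system MIME database).
func byExtension(name string) string {
	return mime.TypeByExtension(filepath.Ext(name))
}

// bySniffing ignores the file name and inspects the leading bytes instead.
func bySniffing(content []byte) string {
	return http.DetectContentType(content)
}

func main() {
	name := "report.csv" // hypothetical file name
	content := []byte("id,name\n1,gopher\n")

	// The result depends on the system: it may be empty when no system
	// MIME database supplies an entry for .csv.
	fmt.Printf("by extension: %q\n", byExtension(name))

	// Sniffing only sees plain text, whatever the extension claims.
	fmt.Printf("by sniffing:  %q\n", bySniffing(content))
}
```

The second call reports a generic plain-text type for CSV data, which is the point being made: sniffing cannot tell plaintext formats apart, so it is not a substitute for an extension table.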
Change https://go.dev/cl/614376 mentions this issue: mime: extend "builtinTypes" to include a more complete list of common types
> What's included is based on WHATWG MIME sniffing (https://mimesniff.spec.whatwg.org/). This gives us a clear spec to adhere to, rather than an arbitrary list.
net/http.DetectContentType is based on WHATWG's spec; this proposal is for the type/extension mapping used by mime.TypeByExtension and other functions in the mime package when the system MIME database (/etc/mime.types or similar) isn't present.
Per the conversation at https://github.com/whatwg/mimesniff/issues/51#issuecomment-2415555310, the intent of the Mimesniff spec is:
"Based on the recent trajectory of changes to this spec, it seems to me that the scope of the spec is client-side sniffing for cross-browser compatibility and protection for the user against malicious files"
The Mimesniff spec is not appropriate for an HTTP server use case. It would be better to adopt a different spec for this.
Alternatively, we need a new, server-side-appropriate function that implements a different spec. (EDIT: This comment was regarding DetectContentType, not TypeByExtension.)
@milhoan But as of now, this doesn't mimesniff. It just maps file extensions to mime types
> @milhoan But as of now, this doesn't mimesniff. It just maps file extensions to mime types
Sorry, I saw the discussion above about DetectContentType being based on that spec (IMO it should not be). Disregard my comment, as it is not about that function. I'm 100% in favor of more MIME type coverage for TypeByExtension.
Looking at what the browsers do for matching file extensions to mime type:
- Chromium (https://chromium.googlesource.com/chromium/src/+/master/net/base/mime_util.cc#129) maintains a primary and secondary mapping, with the preference order being: primary, platform, secondary.
- Firefox (https://searchfox.org/mozilla-central/source/uriloader/exthandler/nsExternalHelperAppService.cpp#2968; list at https://searchfox.org/mozilla-central/source/uriloader/exthandler/nsExternalHelperAppService.cpp#455; const defs at https://searchfox.org/mozilla-central/source/netwerk/mime/nsMimeTypes.h) maintains a default and extra mapping, with the preference order being: default, platform, extras.
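The preference order both browsers use can be sketched as a layered lookup. This is not either browser's code, only an illustration of the ordering: the `lookup` helper and the three small tables are hypothetical, and the `platform` entries in particular stand in for whatever the operating system would report.

```go
package main

import (
	"fmt"
	"strings"
)

// Illustrative sample data only. The platform table is invented to show
// how it can shadow the lowest-priority table.
var (
	primary   = map[string]string{"png": "image/png"}
	platform  = map[string]string{"png": "image/x-png", "ico": "image/x-icon"}
	secondary = map[string]string{"ico": "image/vnd.microsoft.icon", "zip": "application/zip"}
)

// lookup consults the tables in the order given and returns the first
// mapping found for ext. A leading dot and letter case are ignored.
func lookup(ext string, tables ...map[string]string) (string, bool) {
	ext = strings.ToLower(strings.TrimPrefix(ext, "."))
	for _, table := range tables {
		if typ, ok := table[ext]; ok {
			return typ, true
		}
	}
	return "", false
}

func main() {
	for _, ext := range []string{".png", ".ico", ".zip", ".unknown"} {
		typ, ok := lookup(ext, primary, platform, secondary)
		fmt.Printf("%-9s %-28q found=%v\n", ext, typ, ok)
	}
}
```

With this ordering, a primary entry always wins, the platform can override anything in the last table, and the last table only fills gaps the platform leaves.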
Below is a table mapping file extensions to Go MIME types and to Chromium / Firefox inclusion in their primary (1) or secondary (2) lists, along with each browser's MIME type where it differs from what Go has.
extension | go mime type | chrome | firefox |
---|---|---|---|
3g2 | 2 (video/3gpp2) | ||
3gp | 2 (video/3gpp) | ||
3gpp | 2 (video/3gpp) | ||
aac | 2 (audio/aac) | ||
ai | | 2 (application/postscript) | 2 (application/postscript) |
apk | | 2 (application/vnd.android.package-archive) | 2 (application/vnd.android.package-archive) |
apng | | 1 (image/apng) | 2 (image/apng) |
appcache | 2 (text/cache-manifest) | ||
arj | 2 (application/x-arj) | ||
art | 2 (image/x-jg) | ||
avif | image/avif | 1 | 2 |
bin | | 2 (application/octet-stream) | 2 (application/octet-stream) |
bmp | | 2 (image/bmp) | 2 (image/bmp) |
cer | 2 (application/x-x509-ca-cert) | ||
com | | 2 (application/octet-stream) | 2 (application/octet-stream) |
crt | 2 (application/x-x509-ca-cert) | ||
crx | 1 (application/x-chrome-extension) | ||
css | text/css | 1 | 2 |
csv | | 1 (text/csv) | 2 (text/csv) |
cur | 2 (image/x-icon) | ||
doc | | 2 (application/msword) | 2 (application/msword) |
docx | | 2 (application/vnd.openxmlformats-officedocument.wordprocessingml.document) | 2 (application/vnd.openxmlformats-officedocument.wordprocessingml.document) |
dot | 2 (application/msword) | ||
ehtml | | 2 (text/html) | 2 (text/html) |
eml | | 2 (message/rfc822) | 2 (message/rfc822) |
eps | | 2 (application/postscript) | 2 (application/postscript) |
epub | 2 (application/epub+zip) | ||
exe | | 2 (application/octet-stream) | 2 (application/octet-stream) |
flac | | 1 (audio/flac) | 2 (audio/flac) |
ftl | 1 (text/plain) | ||
gif | image/gif | 1 | 2 |
gz | | 2 (application/x-gzip) | 2 (application/gzip) |
htm | text/html | 1 | 2 |
html | text/html | 1 | 2 |
ical | 2 (text/calendar) | ||
icalendar | 2 (text/calendar) | ||
ico | | 2 (image/vnd.microsoft.icon) | 2 (image/x-icon) |
ics | | 2 (text/calendar) | 2 (text/calendar) |
ifb | 2 (text/calendar) | ||
jfif | | 2 (image/jpeg) | 2 (image/jpeg) |
jpeg | image/jpeg | 1 | 2 |
jpg | image/jpeg | 1 | 2 |
js | text/javascript | 2 (application/javascript) | 2 (application/x-javascript) |
jsm | 2 (application/x-javascript) | ||
json | application/json | 2 | 2 |
jxl | 2 (image/jxl) | ||
locale | 1 (text/plain) | ||
m3u8 | 2 (application/x-mpegurl) | ||
m4a | | 1 (audio/x-m4a) | 2 (audio/mp4) |
m4b | 2 (audio/mp4) | ||
m4v | 1 (video/mp4) | ||
mht | 1 (multipart/related) | ||
mhtml | 1 (multipart/related) | ||
mid | 2 (audio/x-midi) | ||
mjs | text/javascript | 1 | 2 (application/x-javascript) |
mml | 2 (application/mathml+xml) | ||
mp2 | 2 (audio/mpeg) | ||
mp3 | | 1 (audio/mp3) | 2 (audio/mpeg) |
mp4 | | 1 (video/mp4) | 2 (video/mp4) |
mpeg | 2 (video/mpeg) | ||
mpega | 2 (audio/mpeg) | ||
mpg | 2 (video/mpeg) | ||
odg | 2 (application/vnd.oasis.opendocument.graphics) | ||
odp | 2 (application/vnd.oasis.opendocument.presentation) | ||
ods | 2 (application/vnd.oasis.opendocument.spreadsheet) | ||
odt | 2 (application/vnd.oasis.opendocument.text) | ||
oga | | 1 (audio/ogg) | 2 (audio/ogg) |
ogg | | 1 (audio/ogg) | 2 (application/ogg) |
ogm | 1 (video/ogg) | ||
ogv | | 1 (video/ogg) | 2 (video/ogg) |
opus | | 1 (audio/ogg) | 2 (audio/ogg) |
p7c | 2 (application/pkcs7-mime) | ||
p7m | 2 (application/pkcs7-mime) | ||
p7s | 2 (application/pkcs7-signature) | ||
p7z | 2 (application/pkcs7-mime) | ||
pdf | application/pdf | 2 | 2 |
pjp | | 2 (image/jpeg) | 2 (image/jpeg) |
pjpeg | | 2 (image/jpeg) | 2 (image/jpeg) |
png | image/png | 2 (image/x-png) | 2 |
ppt | | 2 (application/vnd.ms-powerpoint) | 2 (application/vnd.ms-powerpoint) |
pptx | | 2 (application/vnd.openxmlformats-officedocument.presentationml.presentation) | 2 (application/vnd.openxmlformats-officedocument.presentationml.presentation) |
properties | 1 (text/plain) | ||
ps | | 2 (application/postscript) | 2 (application/postscript) |
rdf | | 2 (application/rdf+xml) | 2 (application/rdf+xml) |
rss | 2 (application/rss+xml) | ||
rtf | | 2 (application/rtf) | 2 (application/rtf) |
sh | 2 (text/x-sh) | ||
shtm | 1 (text/html) | ||
shtml | | 1 (text/html) | 2 (text/html) |
svg | image/svg+xml | 1 | 2 |
svgz | 1 (image/svg+xml) | ||
swf | 2 (application/x-shockwave-flash) | ||
swl | 2 (application/x-shockwave-flash) | ||
tar | 2 (application/x-tar) | ||
text | | 2 (text/plain) | 2 (text/plain) |
tgz | 2 (application/x-gzip) | ||
tif | | 2 (image/tiff) | 2 (image/tiff) |
tiff | | 2 (image/tiff) | 2 (image/tiff) |
txt | | 2 (text/plain) | 2 (text/plain) |
vcard | 2 (text/vcard) | ||
vcf | 2 (text/vcard) | ||
vtt | | 2 (text/vtt) | 2 (text/vtt) |
wasm | application/wasm | 1 | 2 |
wav | | 1 (audio/wav) | 2 (audio/x-wav) |
weba | 2 (audio/webm) | ||
webm | | 1 (audio/webm) | 2 (audio/webm) |
webp | image/webp | 1 | 2 |
woff | 2 (application/font-woff) | ||
xbl | | 2 (text/xml) | 2 (text/xml) |
xbm | | 2 (image/x-xbitmap) | 2 (image/x-xbitmap) |
xht | | 1 (application/xhtml+xml) | 2 (application/xhtml+xml) |
xhtm | 1 (application/xhtml+xml) | ||
xhtml | | 1 (application/xhtml+xml) | 2 (application/xhtml+xml) |
xls | | 2 (application/vnd.ms-excel) | 2 (application/vnd.ms-excel) |
xlsx | | 2 (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet) | 2 (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet) |
xml | text/xml | 1 | 2 |
xpi | 2 (application/x-xpinstall) | ||
xsl | | 2 (text/xml) | 2 (text/xml) |
xslt | 2 (text/xml) | ||
xul | 2 (application/vnd.mozilla.xul+xml) | ||
yuv | 2 (video/x-raw-yuv) | ||
zip | | 2 (application/zip) | 2 (application/zip) |
If we are to add more, I propose we limit it to what both browsers have decided to include in their built in lists.
That sounds good to me, I can update the PR if that is what's decided on
Interestingly, the one case where we override the platform value (on Windows, we ignore a registry entry mapping .js to text/plain) is one where Chrome and Firefox apparently prefer the platform setting.
Limiting our list of builtin mappings to what both Chrome and Firefox include seems reasonably principled. I'd support that.
Proposal Details
Right now, mime/type.go includes what seems to be a somewhat arbitrary list of built-in types:
I think some guidance on what should be included here would be good, so that consumers of the package don't run into arbitrary gaps without realizing it. In the meantime, I will submit a PR that incorporates all MDN-defined "Common Types" (which, I have to admit, is also arbitrary, but at least covers more common use cases).