samuelneff / MimeTypeMap

Provides a huge dictionary of file extensions to mime types.
MIT License

Official mimetypes vs. extension mapping for m4b files #110

Closed by sandreas 3 years ago

sandreas commented 3 years ago

Hello,

Thank you for providing this useful library. I just ran into an issue with m4b files, so I would like to ask which sources you used for the mapping.

The result for m4b files is audio/m4b, but AFAIK this type does not exist (it is not registered). The "official" list of registered media types, each linked to an RFC or a contact person, is available at:

https://www.iana.org/assignments/media-types/media-types.xhtml

These are listed without an extension mapping, since the extension of a file can only be a hint for the real MIME type, although it may be a good hint ;-)

The best resource I found WITH extension mapping was: http://svn.apache.org/repos/asf/httpd/httpd/trunk/docs/conf/mime.types which also lacks m4b :-/
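To make the comparison with the Apache list concrete, here is a minimal Python sketch that parses text in the mime.types format (one media type per line followed by its extensions, `#` starting a comment) into an extension lookup. The function name `parse_mime_types` and the sample lines are assumptions for illustration; the sample is not a verbatim copy of the Apache file, and this is not code from MimeTypeMap itself.

```python
def parse_mime_types(text):
    """Parse Apache-style mime.types text into an {'.ext': 'type/subtype'} dict."""
    mapping = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        media_type, *extensions = line.split()
        for ext in extensions:
            # Keep the first type seen for an extension.
            mapping.setdefault("." + ext.lower(), media_type)
    return mapping


# Illustrative sample only (assumption), in the same layout as mime.types.
SAMPLE = """\
# MIME type        Extensions
audio/mp4          m4a mp4a
# audio/x-example  example
video/mp4          mp4 mp4v mpg4
"""

mapping = parse_mime_types(SAMPLE)
print(mapping.get(".m4a"))
print(mapping.get(".m4b"))
```

With this sample, `.m4b` has no entry, which mirrors the gap described above.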

Another good resource might be the magic.mgc which is common on Unix: https://man7.org/linux/man-pages/man4/magic.4.html

Searching the types definition for m4b shows that it is listed as audio/x-m4b.

# grep 'm4b' /usr/share/mime/types
audio/x-m4b
# file -b --mime-type sample.m4b
audio/x-m4b

So it seems that in this library / type definition, every custom MIME type that is not in the official listing is prefixed with x-.

The magic.mgc file contains a set of binary analysis instructions that do not rely on the extension. Its result may also be incorrect (or at least unwanted), though: for some m4b files it is video/mp4. That is technically correct, but assuming the extension is valid, the wanted result is audio/mp4 or, better, audio/x-m4b, even if the file contains the binary signature of a video.
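One way to reconcile the two signals is to let the extension refine a sniffed container type, but only for combinations known to be compatible. The Python sketch below illustrates that idea under stated assumptions: `resolve_mime_type`, `EXTENSION_TYPES`, and `CONTAINER_REFINEMENTS` are hypothetical names, the tables hold only the types discussed here, and the sniffed type is passed in rather than detected.

```python
import os

# Hypothetical extension table (assumption): only the types discussed here.
EXTENSION_TYPES = {
    ".m4a": "audio/mp4",
    ".m4b": "audio/x-m4b",
}

# Sniffed container type -> more specific types the extension may select.
CONTAINER_REFINEMENTS = {
    "video/mp4": {"audio/mp4", "audio/x-m4b"},
}


def resolve_mime_type(filename, sniffed_type):
    """Prefer the extension's type only when it refines the sniffed container."""
    ext = os.path.splitext(filename)[1].lower()
    ext_type = EXTENSION_TYPES.get(ext)
    if ext_type is None or ext_type == sniffed_type:
        return sniffed_type
    if ext_type in CONTAINER_REFINEMENTS.get(sniffed_type, ()):
        return ext_type
    # The extension contradicts the content, so trust the content.
    return sniffed_type


print(resolve_mime_type("book.m4b", "video/mp4"))
```

Under these assumptions, an m4b file sniffed as video/mp4 resolves to audio/x-m4b, while a file whose content clearly contradicts its extension keeps the sniffed type.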

To get to the bottom line:

samuelneff commented 3 years ago

@sandreas Thank you for the detailed analysis. I looked into the history of how m4b got added to the list and did some additional research online. My conclusion is the same as yours: it should be audio/x-m4b.

If you would be so kind as to submit a PR, I'll merge it.

Thanks again,

Sam

sandreas commented 3 years ago

Ok, thx for the quick response. I'll take some time to check the other MIME types against the mentioned resources and submit a pull request asap.

sandreas commented 3 years ago

The following other mappings are possibly wrong because they lack the x- prefix:

.jar - actual: application/java-archive expected: application/x-java-archive
.aiff - actual: audio/aiff expected: audio/x-aiff
.m4b - actual: audio/m4b expected: audio/x-m4b
.pls - actual: audio/scpls expected: audio/x-scpls
.wav - actual: audio/wav expected: audio/x-wav
.pic - actual: image/pict expected: image/x-pict

I will submit a PR with all these corrected.
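The corrections listed above can be expressed as data and applied mechanically. The following Python sketch is only an illustration of that approach: the names `CORRECTIONS` and `apply_corrections` are hypothetical, the library itself is not written in Python, and an entry is changed only when it still holds the "actual" value from the list.

```python
# extension -> (actual value, expected value), taken from the list above.
CORRECTIONS = {
    ".jar": ("application/java-archive", "application/x-java-archive"),
    ".aiff": ("audio/aiff", "audio/x-aiff"),
    ".m4b": ("audio/m4b", "audio/x-m4b"),
    ".pls": ("audio/scpls", "audio/x-scpls"),
    ".wav": ("audio/wav", "audio/x-wav"),
    ".pic": ("image/pict", "image/x-pict"),
}


def apply_corrections(mapping, corrections=CORRECTIONS):
    """Return a copy of mapping with each still-uncorrected entry replaced."""
    fixed = dict(mapping)
    for ext, (actual, expected) in corrections.items():
        # Leave entries alone if they were already changed to something else.
        if fixed.get(ext) == actual:
            fixed[ext] = expected
    return fixed


current = {".m4b": "audio/m4b", ".wav": "audio/wav", ".txt": "text/plain"}
print(apply_corrections(current))
```

Checking the current value before replacing it keeps the operation safe to run more than once.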

After further investigation, I found out that about 700 items in /usr/share/mime/types have no mapping. Some examples:

text/xmcd
video/mp2t
video/vnd.mpegurl

The list is too long to correct all of them, but I'll try to do some of them. I'll create a second pull request, if I find some other improvements.
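A comparison like this can be reproduced with a set difference. The Python sketch below assumes the one-type-per-line layout of /usr/share/mime/types shown in the grep output earlier; the function name `unmapped_types` and the sample data are hypothetical, and the set of already mapped types would have to be exported from the library separately.

```python
def unmapped_types(types_text, mapped_types):
    """Return the sorted types from types_text that are absent from mapped_types."""
    system = {
        line.strip().lower()
        for line in types_text.splitlines()
        if line.strip()
    }
    known = {t.lower() for t in mapped_types}
    return sorted(system - known)


# Sample input (assumption): a few lines in the layout of /usr/share/mime/types.
sample = "audio/x-m4b\ntext/xmcd\nvideo/mp2t\nvideo/vnd.mpegurl\n"
missing = unmapped_types(sample, ["audio/x-m4b"])
print(missing)
```

On a system that ships the file, the real content could be read with `open("/usr/share/mime/types").read()` and passed in place of `sample`.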

sandreas commented 3 years ago

Did not have the time to fix this, sorry.