python / cpython

Extension for MIME type is not recognized #111637

Open frankenstein91 opened 10 months ago

frankenstein91 commented 10 months ago

Bug report

Bug description:

I currently have a problem guessing file extensions from a MIME type. I understood that only types registered with IANA can be guessed, so I checked the linked page. As far as I can tell, the type I need is registered with IANA.

The IANA link: https://www.iana.org/assignments/media-types/application/vnd.openxmlformats-officedocument.wordprocessingml.document

Python 3.11.5 (main, Sep  2 2023, 14:16:33) [GCC 13.2.1 20230801] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mimetypes
>>> print(mimetypes.guess_extension("vnd.openxmlformats-officedocument.wordprocessingml.document"))
None

I hoped to see .docx as a return value

CPython versions tested on:

3.11

Operating systems tested on:

Linux, Windows

zooba commented 10 months ago

The type is application/vnd.openxmlformats-officedocument.wordprocessingml.document (note the "application" at the start). Both the media type and the subtype are required.
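To illustrate the point about needing both parts, here is a minimal sketch that checks for the `type/subtype` form before calling `mimetypes.guess_extension`. The helper name `guess_extension_checked` is hypothetical, not part of the standard library, and the value returned for the full type depends on the platform and Python version.

```python
import mimetypes

DOCX_TYPE = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"


def guess_extension_checked(media_type: str):
    """Hypothetical helper: reject strings that are not in 'type/subtype' form."""
    top, sep, sub = media_type.partition("/")
    if not (top and sep and sub):
        raise ValueError(f"expected 'type/subtype', got {media_type!r}")
    # May still return None if this type is not known on the current system.
    return mimetypes.guess_extension(media_type)


try:
    guess_extension_checked("vnd.openxmlformats-officedocument.wordprocessingml.document")
except ValueError as exc:
    print("rejected:", exc)

print(guess_extension_checked(DOCX_TYPE))
```

A bare subtype is rejected early instead of silently yielding `None`, which makes this kind of copy-paste mistake easier to spot.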

frankenstein91 commented 10 months ago

Yes, I must have made a mistake when I copied the MIME type on my Linux system. But I still can't get any further on my Windows system. @zooba: Do you see my mistake here too? That would be very nice.

Python 3.11.6 (tags/v3.11.6:8b6ee5b, Oct  2 2023, 14:57:12) [MSC v.1935 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import mimetypes
>>> print(mimetypes.guess_extension("application/vnd.openxmlformats-officedocument.wordprocessingml.document"))
None
>>>

zooba commented 10 months ago

Do you have Office installed?

Windows doesn't have an exhaustive database built in; instead, it allows applications to register their own MIME types when they are installed.
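If the mapping is not registered on a given machine, one possible workaround is to register it in your own code. The sketch below uses a private `mimetypes.MimeTypes` instance and `add_type`, so the module-wide tables are left untouched. This is an illustration, not something proposed in this thread.

```python
import mimetypes

DOCX_TYPE = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"

# Private database; add_type() here does not change the module-level tables.
db = mimetypes.MimeTypes()
db.add_type(DOCX_TYPE, ".docx")

docx_ext = db.guess_extension(DOCX_TYPE)
docx_type, _encoding = db.guess_type("report.docx")  # "report.docx" is just a sample name

print(docx_ext)   # .docx
print(docx_type)
```

With the explicit registration, the lookup no longer depends on whether Office (or any other application) has registered the type with the operating system.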

frankenstein91 commented 10 months ago

I do not have Office installed... but it looks like Windows knows it [screenshot]

zooba commented 10 months ago

It may know a display name without having a MIME registration - those are separate things in Windows.

Your code works fine for me with Python 3.11 and Word installed, so I would assume it's simply not registered in your case. There's nothing we can do about that - send feedback to Microsoft through the Windows Feedback tool?

frankenstein91 commented 10 months ago

@zooba maybe the devs should then change the wording in https://docs.python.org/3/library/mimetypes.html from:

"The optional strict argument is a flag specifying whether the list of known MIME types is limited to only the official types [registered with IANA](https://www.iana.org/assignments/media-types/media-types.xhtml). When strict is True (the default), only the IANA types are supported; when strict is False, some additional non-standard but commonly used MIME types are also recognized."

to something like:

"Only file types known to the operating system are recognized."

I personally find it very confusing otherwise
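For readers unsure what the `strict` flag in the quoted documentation does in practice, here is a small sketch. The MIME type and extension are made up for the demonstration, and a private `MimeTypes` instance is used so that the result does not depend on what the operating system has registered.

```python
import mimetypes

# Hypothetical type and extension, chosen so they cannot clash with real entries.
DEMO_TYPE = "application/x-hypothetical-demo"
DEMO_EXT = ".hypodemo"

db = mimetypes.MimeTypes()
db.add_type(DEMO_TYPE, DEMO_EXT, strict=False)  # register as a non-standard type

strict_result = db.guess_extension(DEMO_TYPE)                 # strict=True is the default
relaxed_result = db.guess_extension(DEMO_TYPE, strict=False)  # also consult non-standard types

print(strict_result)   # None
print(relaxed_result)  # .hypodemo
```

So `strict` only selects which of the two internal tables is consulted; whether a given standard type is present at all is a separate question.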

zooba commented 10 months ago

You're more than welcome to create a pull request with the updates. You'll find Doc/library/mimetypes.rst in this repository.

frankenstein91 commented 10 months ago

Unfortunately, English is not my mother tongue; I would suggest that someone from the USA or the UK could formulate this much better than I can. @patrickmccallum: sorry for linking you here... but do you have an idea how your database could become a part of Python?