Closed sr-verde closed 1 year ago
I'm not familiar with that libmagic feature, can you give an example of how it's used?
Of course, so let's have a look at an example with different types of files. First, an example with two different files without MAGIC_CONTINUE, then an example with MAGIC_CONTINUE.
Python 3.7.6 (default, Jan 6 2020, 15:14:19)
[GCC 9.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from magic import Magic
>>>
>>> m = Magic(mime=True)
>>> m.from_file('/tmp/archive.tar.gz')
'application/gzip'
>>> m.from_file('/tmp/launchKVMJava.do')
'text/html'
>>>
>>> m = Magic(mime=True, keep_going=True)
>>> m.from_file('/tmp/archive.tar.gz')
'application/gzip\\012- application/octet-stream'
>>> m.from_file('/tmp/launchKVMJava.do')
'text/html\\012- text/plain'
With the MAGIC_CONTINUE flag active, you'll get all matches, not just the first. \\012 indicates a line break (octal 012 is the newline character); all following mimetypes are prefixed with a - (indicating a list?).
What one could do is return an array instead of this creepy string.
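Until something like that exists, a caller can build the array themselves by splitting the string. The following is only a minimal sketch based on the output format shown in the session above: `split_mime_types` is a hypothetical helper, not part of python-magic, and handling a real newline in addition to the escaped `\012` is an assumption about what other libmagic configurations might return.

```python
from typing import List


def split_mime_types(raw: str) -> List[str]:
    """Split a keep_going result string into a list of MIME types.

    Hypothetical helper, not part of python-magic.
    """
    # In the session above, matches are separated by a literal
    # backslash followed by "012". Assumption: some setups may
    # return a real newline instead, so both are normalized.
    normalized = raw.replace("\\012", "\n")

    mime_types = []
    for line in normalized.split("\n"):
        line = line.strip()
        # Every match after the first is prefixed with "- ".
        if line.startswith("- "):
            line = line[2:]
        if line:
            mime_types.append(line)
    return mime_types


print(split_mime_types("application/gzip\\012- application/octet-stream"))
```

Used on the strings from the session above, this would turn each result into a plain list with the first match at index 0.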
From an API perspective, changing the return type of from_[file,buffer] depending on the CONTINUE flag would be confusing.
One thing I've considered is moving to a structured return value, similar to how libmagic's wrapper does it, which could be more easily extended to include char encoding, mime type, text description, etc. all at the same time. Then this sort of "and other types" feature could be more easily tacked on as an optional list.
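To make the idea concrete, a structured return value might look roughly like the sketch below. All names here (`MagicResult`, `other_mime_types`, `result_from_raw`) are hypothetical and chosen only for illustration; this is not python-magic's API, and the parsing assumes the `\012- ` separator seen in the session above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MagicResult:
    """Hypothetical structured result; not part of python-magic."""
    mime_type: str
    encoding: Optional[str] = None
    description: Optional[str] = None
    # Extra matches reported when MAGIC_CONTINUE is active.
    other_mime_types: List[str] = field(default_factory=list)


def result_from_raw(raw_mime: str,
                    encoding: Optional[str] = None,
                    description: Optional[str] = None) -> MagicResult:
    """Build a MagicResult from a raw keep_going MIME string."""
    # Assumption: matches are separated by an escaped "\012"
    # (or a real newline) and later matches start with "- ".
    lines = raw_mime.replace("\\012", "\n").split("\n")
    parts = []
    for line in lines:
        line = line.strip()
        if line.startswith("- "):
            line = line[2:]
        if line:
            parts.append(line)
    if not parts:
        raise ValueError("no MIME type found in input")
    return MagicResult(mime_type=parts[0],
                       encoding=encoding,
                       description=description,
                       other_mime_types=parts[1:])


result = result_from_raw("text/html\\012- text/plain")
print(result.mime_type, result.other_mime_types)
```

With a shape like this, the return type stays the same whether or not the CONTINUE flag is set; the extra matches simply show up as a non-empty list.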
This would be a pretty substantial API change for a niche case, so I would prefer not to do it.
I want to get all mimetypes of a file using the MAGIC_CONTINUE flag. I would prefer an array instead of a string that needs to be parsed first. Is it possible to retrieve all mimetypes as an array? If not, would it be within the scope of this project to implement something that fulfills my requirements? I would be willing to spend some time implementing it.