ahupp / python-magic

A python wrapper for libmagic

Support file version 5.40 and above #240

Closed bitstreamout closed 3 years ago

bitstreamout commented 3 years ago

New file version 5.40 counts pages in PDF files:

abuild@noether:~/rpmbuild/BUILD/python-magic-0.4.18> file --version
file-5.40
magic file from /etc/magic:/usr/share/misc/magic
abuild@noether:~/rpmbuild/BUILD/python-magic-0.4.18> file test/testdata/test.pdf 
test/testdata/test.pdf: PDF document, version 1.2, 2 pages

ahupp commented 3 years ago

Are you requesting that this output gets parsed into some structured metadata?

bitstreamout commented 3 years ago

> Are you requesting that this output gets parsed into some structured metadata?

I do not get this ... I've submitted the attached patch to our local build of the (python-)python-magic package: fix-4-file-5.40.zip

ahupp commented 3 years ago

python-magic does not include a copy of libmagic; it requires that one be provided by the system. So if you have libmagic 5.40 installed, this new feature should just work.

bitstreamout commented 3 years ago

With libmagic 5.40, the python-magic test of its PDF test file simply fails, and the patch in the attached zip fixes this. Compare with bug https://bugzilla.opensuse.org/show_bug.cgi?id=1184881 ... if you think this is not required, then simply close this issue.

ahupp commented 3 years ago

Ah, you mean the tests. Thanks for letting me know!

pombredanne commented 3 years ago

@bitstreamout since your patch makes the test fail with earlier file versions, what about testing `startswith('PDF document, version 1.2')` instead? @ahupp is there a way to do that in your test framework?
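
A minimal sketch of the prefix check suggested above, assuming that earlier file versions report the same description without the page count. The helper name `is_expected_pdf_description` is hypothetical and is not part of python-magic or its test suite.

```python
# Hypothetical helper; not part of python-magic or its test suite.
PDF_PREFIX = "PDF document, version 1.2"


def is_expected_pdf_description(description: str) -> bool:
    """Return True if a libmagic description starts with the expected PDF prefix."""
    return description.startswith(PDF_PREFIX)


# Output reported in this issue for file 5.40:
new_output = "PDF document, version 1.2, 2 pages"
# Assumed output of earlier file versions (no page count):
old_output = "PDF document, version 1.2"

print(is_expected_pdf_description(new_output))
print(is_expected_pdf_description(old_output))
```

The trade-off is that a prefix check ignores anything a later file version appends, so it would not notice a wrong page count.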

ahupp commented 3 years ago

@pombredanne I fixed those tests to handle both outputs in 0.4.23
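
One way a test can handle both outputs is to accept any of several expected strings. The sketch below uses only `unittest`; the class name, the helper name, and the hard-coded description are illustrative assumptions, not the actual change made in 0.4.23.

```python
import unittest

# Descriptions considered valid for test.pdf. The first is assumed to be the
# pre-5.40 output; the second is the 5.40 output quoted in this issue.
EXPECTED_PDF_DESCRIPTIONS = (
    "PDF document, version 1.2",
    "PDF document, version 1.2, 2 pages",
)


class PdfDescriptionTest(unittest.TestCase):
    def check_description(self, actual, expected):
        # Accept a single expected string or a tuple of alternatives.
        if isinstance(expected, str):
            expected = (expected,)
        self.assertIn(actual, expected)

    def test_pdf_description(self):
        # In a real test this string would come from magic.from_file(...).
        actual = "PDF document, version 1.2, 2 pages"
        self.check_description(actual, EXPECTED_PDF_DESCRIPTIONS)
```

Saved to a file, this can be run with `python -m unittest`. Unlike a prefix check, an explicit list of alternatives still fails if libmagic reports something unexpected, such as a wrong page count.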