cdgriffith / puremagic

Pure python implementation of identifying files based off their magic numbers
MIT License

For Python 3.13: A drop-in replacement for `imghdr.what()` #72

Closed: cclauss closed this issue 2 months ago

cclauss commented 3 months ago

Given the discussion in #67 about imghdr being removed from the Python Standard Library, it might be quite helpful to have a drop-in replacement for imghdr.what(). It would provide a smooth transition to Py3.13 if developers could confidently replace all instances of imghdr.what() with puremagic.what() -- same args, same results.

NebularNerd commented 3 months ago

Oddly, I was thinking of this the other day. I've not used imghdr; does it literally just return an extension?

cclauss commented 3 months ago

% python3.12 -c"import imghdr ; print(imghdr.what(None, b'\xff\xd8\xff\xdb'))"

<string>:1: DeprecationWarning: 'imghdr' is deprecated and slated for removal in Python 3.13
jpeg

Source code: https://github.com/python/cpython/blob/3.12/Lib/imghdr.py

% python3.12

>>> import imghdr
<stdin>:1: DeprecationWarning: 'imghdr' is deprecated and slated for removal in Python 3.13
>>> ", ".join(sorted(test_func.__name__[5:] for test_func in imghdr.tests))
'bmp, exr, gif, jpeg, pbm, pgm, png, ppm, rast, rgb, tiff, webp, xbm'

NebularNerd commented 3 months ago

That's super basic. At a guess, we could go with a very simple shim:

import puremagic as imghdr

Add a `what()` function that just returns the highest-confidence extension from the database, minus the leading dot. That would provide a very basic drop-in replacement with the added power of the larger database. Folks could then transition to a proper PureMagic implementation.
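A rough sketch of what such a `what()` could look like, written so it runs without puremagic installed: the extension detector is passed in as `detect`, which is a hypothetical parameter for illustration and not part of imghdr or puremagic. `IMGHDR_TYPES` is the list of names printed above; `EXT_ALIASES` is an assumed, incomplete mapping from common extensions to imghdr's spelling. Unlike `imghdr.what()`, this sketch only accepts a path (not an open file object) when `h` is omitted.

```python
from typing import Callable, Optional

# The type names reported by imghdr.tests in Python 3.12 (see the listing above).
IMGHDR_TYPES = frozenset(
    "bmp exr gif jpeg pbm pgm png ppm rast rgb tiff webp xbm".split()
)

# Assumed alias table: common extensions mapped to the names imghdr uses.
EXT_ALIASES = {"jpg": "jpeg", "tif": "tiff"}


def what(
    file,
    h: Optional[bytes] = None,
    *,
    detect: Callable[[bytes], str],
) -> Optional[str]:
    """Return an imghdr-style type name, or None if nothing matches.

    `detect` is any callable that maps raw bytes to an extension such as
    ".jpg" and raises an exception when it finds no match.
    """
    if h is None:
        # No header supplied, so read the data from the given path.
        with open(file, "rb") as fh:
            h = fh.read()
    try:
        ext = detect(h)
    except Exception:
        # imghdr.what() returns None for unknown data instead of raising.
        return None
    # Drop the leading dot and normalise to imghdr's spelling.
    name = ext.lstrip(".").lower()
    name = EXT_ALIASES.get(name, name)
    # Only report types that imghdr itself would have reported.
    return name if name in IMGHDR_TYPES else None
```

With puremagic available, `detect=puremagic.from_string` would be the obvious thing to plug in. Whether its top match for a given header is spelled the way imghdr spelled it is the part that needs checking, which is why the alias table and the `IMGHDR_TYPES` filter are there.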

cdgriffith commented 2 months ago

Released in https://github.com/cdgriffith/puremagic/releases/tag/1.24 thanks!