zooniverse / panoptes-python-client

Apache License 2.0

libmagic installation on Windows #264

Open PmasonFF opened 2 years ago

PmasonFF commented 2 years ago

This issue has been raised (and ignored) before (open issue #215), but the latest release, version 1.4, does not even allow the previous workaround (installing python-magic-bin or, in some cases, python-magic-win64). Currently a standard pip install on a Windows machine produces a libmagic warning:

Broken libmagic installation detected. The python-magic module is installed but can't be imported. Please check that both python-magic and the libmagic shared library are installed correctly. Uploading media other than images may not work.

Upload of any video or audio files fails.

Currently I am running version 1.3 with the package python-magic replaced by python-magic-bin, which is fully functional within the limitations of version 1.3. If I try this workaround in version 1.4 I get:

Traceback (most recent call last):
  File "C:/py/Panoptes_client/work_with_Client_link_subjects.py", line 3, in <module>
    import panoptes_client
  File "C:\Users\User\AppData\Roaming\Python\Python38\site-packages\panoptes_client\__init__.py", line 1, in <module>
    from panoptes_client.classification import Classification
  File "C:\Users\User\AppData\Roaming\Python\Python38\site-packages\panoptes_client\classification.py", line 3, in <module>
    from panoptes_client.panoptes import LinkResolver, PanoptesObject
  File "C:\Users\User\AppData\Roaming\Python\Python38\site-packages\panoptes_client\panoptes.py", line 25, in <module>
    class Panoptes(object):
  File "C:\Users\User\AppData\Roaming\Python\Python38\site-packages\panoptes_client\panoptes.py", line 58, in Panoptes
    'User-Agent': 'panoptes-python-client/version=' + pkg_resources.require('panoptes_client')[0].version
  File "C:\Users\User\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pkg_resources\__init__.py", line 886, in require
    needed = self.resolve(parse_requirements(requirements))
  File "C:\Users\User\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pkg_resources\__init__.py", line 772, in resolve
    raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'python-magic<0.5,>=0.4' distribution was not found and is required by panoptes-client

Process finished with exit code 1

In my experience most of the smaller teams on Zooniverse are using Windows. Even those on Mac using a 64 bit Python may need to run python-magic-bin, but I am not certain of this. The last team I set up on Mac was Woodpecker Cavity Cams; they were uploading video and the installation was complicated by requiring ffmpeg for file conversion, and I cannot recall what we had to do to get panoptes_client working.

Thanks! Peter

adammcmaster commented 2 years ago

@PmasonFF Did you actually remove the python-magic package at some point? If so I think you need to still have it installed, even if you then install python-magic-bin on top of it. As far as pkg_resources is concerned they're separate packages (it doesn't know one is a fork of the other), and panoptes_client specifically requires python-magic.
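
A minimal sketch of how to check which of the two distributions are actually registered in an environment, assuming Python 3.8+ (for importlib.metadata). The helper name installed_version is made up for this example and is not part of panoptes_client.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; not part of panoptes_client.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# The two names are looked up independently: having one installed does not
# make the other one show up as installed.
for name in ("python-magic", "python-magic-bin"):
    print(name, installed_version(name) or "not installed")
```

If python-magic shows as "not installed" here, the requirement declared by panoptes_client is unmet regardless of whether python-magic-bin is present.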

adammcmaster commented 2 years ago

From testing it here, I think you'll need to do the following to get it all installed in the right order:

pip install python-magic
pip uninstall python-magic-bin
pip install python-magic-bin

PmasonFF commented 2 years ago

For version 1.3 and earlier one could install panoptes_client, which would bring in python-magic automatically, then, depending on the Windows version of Python in use, install either python-magic-bin (64 bit Python, 64 bit processor) or python-magic-win64 (32 bit Python, 64 bit machine), then uninstall python-magic, and end up with a fully functional panoptes client installation (up to the version of the client installed). Since version 1.4 that no longer works.

Workarounds that appear to work:

Windows 10, 64 bit, Python 3.9 64 bit

Install panoptes_client as normal, then install python-magic-bin; do not uninstall python-magic. Python will use the most recently installed version of libmagic (from python-magic-bin), but the presence of python-magic will satisfy the panoptes_client requirements list and the installation will function.

Windows 10, 64 bit, Python 3.8 32 bit

Install panoptes_client as normal, then install python-magic-bin; do not uninstall python-magic. The 32 bit Python no longer seems to require python-magic-win64.
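
A quick way to confirm that a workaround took effect is to try importing magic and asking it to identify a few known bytes. This is only a sketch: libmagic_status is a hypothetical helper name, and the check assumes the installed magic module exposes from_buffer(..., mime=True), as python-magic does.

```python
def libmagic_status():
    """Return (ok, message) describing whether python-magic can use libmagic.

    Hypothetical helper for illustration; not part of panoptes_client.
    """
    try:
        import magic  # fails if the module or the libmagic library is missing
    except Exception as exc:
        return False, "import magic failed: {}".format(exc)
    try:
        # Eight-byte PNG signature; enough for a smoke test of the library.
        mime = magic.from_buffer(b"\x89PNG\r\n\x1a\n", mime=True)
    except Exception as exc:
        return False, "libmagic call failed: {}".format(exc)
    return True, "libmagic reports {}".format(mime)


ok, message = libmagic_status()
print("OK" if ok else "BROKEN", "-", message)
```

A "BROKEN" result corresponds to the situation in which the client prints the "Broken libmagic installation detected" warning.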

PmasonFF commented 2 years ago

Could someone on a Mac or Linux machine verify this? I believe python-magic-bin also works on those machines, and if so perhaps panoptes_client can require python-magic-bin and avoid the need for workarounds on Windows?

adammcmaster commented 2 years ago

Unfortunately it doesn't look like python-magic-bin is being maintained, so it might not be something we could rely on by default. It's been almost four years since it was last updated, and over a year since any activity on their GitHub repository.

PmasonFF commented 2 years ago

Unfortunately it doesn't look like python-magic-bin is being maintained, so it might not be something we could rely on by default.

Well, at this point I have no other workarounds for Windows users. My experience is primarily with those Zooniverse project teams that have little IT support, but within that group the majority are on Windows. Some teams (e.g. Cornell Bird Labs) believed panoptes client cannot be used on Windows! The status quo should certainly not remain. The requirements list should be platform specific and require packages that result in a functional installation. At the very least the error message should include instructions for fixing the problem on Windows machines.

adammcmaster commented 2 years ago

It looks like it's possible to specify platform-specific dependencies: https://setuptools.readthedocs.io/en/latest/userguide/dependency_management.html#platform-specific-dependencies So we could only install python-magic-bin on Windows (and hopefully there aren't any bad effects from it being so outdated).
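
As a rough sketch of what that could look like in setup.py, using PEP 508 environment markers: the python-magic pin is the one shown in the traceback above, while the unpinned python-magic-bin line and the name INSTALL_REQUIRES are assumptions for illustration, not the project's actual configuration.

```python
# Sketch of a platform-specific requirements list (an assumption, not the
# project's real setup.py). It would be passed to setuptools as
# setup(install_requires=INSTALL_REQUIRES, ...).
INSTALL_REQUIRES = [
    # Everywhere except Windows: the regular package, which relies on a
    # separately installed libmagic shared library.
    'python-magic>=0.4,<0.5; sys_platform != "win32"',
    # Windows only: the fork that ships its own libmagic binaries.
    # Left unpinned here because no version is named in this thread.
    'python-magic-bin; sys_platform == "win32"',
]

for requirement in INSTALL_REQUIRES:
    print(requirement)
```

With markers like these, pip evaluates sys_platform at install time and skips the line that does not apply.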

Or perhaps a better option would be to fall back to the mimetypes module instead of imghdr in subject.py if libmagic is broken.

PmasonFF commented 2 years ago

Since the whole point of the MIME type identification is to classify the location files/URLs into one of the very few file types that are actually supported by the Panoptes API, it is only necessary to deal with a few very common MIME types. The module does not need to handle any of the rare file types that make guessing the MIME type difficult, so the built-in mimetypes module should be quite sufficient.
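
A minimal sketch of that idea using only the standard library. The allow-list below is illustrative only and is not the authoritative list of types the Panoptes API accepts; supported_mime_type is a hypothetical helper name.

```python
import mimetypes

# Illustrative allow-list only (an assumption for this example), NOT the
# authoritative list of media types supported by the Panoptes API.
SUPPORTED_TYPES = {
    "image/jpeg",
    "image/png",
    "image/gif",
    "video/mp4",
    "audio/mpeg",
}


def supported_mime_type(filename):
    """Guess a MIME type from the file extension; None if not on the list."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed if guessed in SUPPORTED_TYPES else None


for name in ("photo.png", "clip.mp4", "notes.unknownext"):
    print(name, "->", supported_mime_type(name))
```

Because the guess is based purely on the file name, a file with the wrong extension will be misclassified.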

PmasonFF commented 2 years ago

It was relatively straightforward to rewrite subject.py to use the built-in mimetypes package, at least for local files. But that approach will not work with open file-like objects, since mimetypes guesses from the file name, and I did not do much testing in Python 2 other than to check that mimetypes is supported.

imghdr can be used for open file-like objects if they are images, but that would leave file-like objects of other types unsupported.

https://github.com/PmasonFF/Zooniverse-data-digging/blob/master/Panoptes_client_examples/subject.py

adammcmaster commented 2 years ago

I suspect we’d want to implement it so that it uses the best out of all three depending on the situation — first use python-magic if it’s available, then fall back to imghdr, and finally try mimetypes as a last resort (I’ve seen cases where people have files named with the wrong extension, so I would want to try imghdr before mimetypes to correctly handle those cases).
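
The ordering described above could be sketched roughly as follows for local files. This is not the code in subject.py; guess_mime_type is a hypothetical name, and the imghdr import is guarded because that module was removed from the standard library in Python 3.13.

```python
import mimetypes

try:
    import magic  # python-magic; import fails if libmagic cannot be loaded
except Exception:
    magic = None

try:
    import imghdr  # deprecated since Python 3.11, removed in 3.13
except ImportError:
    imghdr = None


def guess_mime_type(path):
    """Guess the MIME type of a local file: magic, then imghdr, then mimetypes.

    Hypothetical helper for illustration; returns None if every step fails.
    """
    if magic is not None:
        try:
            return magic.from_file(path, mime=True)
        except Exception:
            pass  # fall through to the next detector
    if imghdr is not None:
        kind = imghdr.what(path)  # inspects file contents, images only
        if kind:
            return "image/" + kind
    guessed, _encoding = mimetypes.guess_type(path)  # extension-based
    return guessed
```

The content-based checks run first, so a misnamed image is still identified from its bytes; the extension is consulted only when nothing else is available.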

Since I’m not part of the Zooniverse team any more I don’t think I’ll have time to implement this myself, but I’d be happy to help review a PR if anyone else has time to do it.

lcjohnso commented 1 day ago

I wanted to add an update regarding libmagic installation. A quick summary:

There's an ongoing discussion on the python-magic repo regarding the potential for bundling libraries (particularly for Windows) as part of the main python-magic package. I will keep a close eye on how that conversation pans out, and pursue improvements regarding libmagic installation depending on the outcome. Possible paths forward include:

  1. relying on bundled libraries within python-magic (best solution, but not clear this change will be accepted and bundling might be a Windows-only solution)
  2. adding python-magic-bin install by default as part of Client install on Windows (in case where binaries are not bundled)
  3. improving printed warnings and docs re: secondary install instructions (will do in any case!)