weecology / retriever

Quickly download, clean up, and install public datasets into a database management system
http://data-retriever.org

Installation from source fails due to missing configuration #1671

Open dikwickley opened 8 months ago

dikwickley commented 8 months ago

Python version: 3.11.6

I built from source by following the steps in the README (https://github.com/weecology/retriever#to-install-from-source):

# after cloning the repository
pip install . -U

Running retriever ls then gives the following error:

(env) aniket@air retriever % retriever ls    
Traceback (most recent call last):
  File "/Users/aniket/Projects/gsoc/retriever/env/bin/retriever", line 5, in <module>
    from retriever.__main__ import main
  File "/Users/aniket/Projects/gsoc/retriever/env/lib/python3.11/site-packages/retriever/__init__.py", line 7, in <module>
    from retriever.lib.engine_tools import set_proxy, create_home_dir
  File "/Users/aniket/Projects/gsoc/retriever/env/lib/python3.11/site-packages/retriever/lib/__init__.py", line 4, in <module>
    from .datasets import datasets
  File "/Users/aniket/Projects/gsoc/retriever/env/lib/python3.11/site-packages/retriever/lib/datasets.py", line 1, in <module>
    from retriever.lib.scripts import SCRIPT_LIST, get_script, get_dataset_names_upstream
  File "/Users/aniket/Projects/gsoc/retriever/env/lib/python3.11/site-packages/retriever/lib/scripts.py", line 506, in <module>
    global_script_list = StoredScripts()
                         ^^^^^^^^^^^^^^^
  File "/Users/aniket/Projects/gsoc/retriever/env/lib/python3.11/site-packages/retriever/lib/scripts.py", line 495, in __init__
    self._shared_scripts = SCRIPT_LIST()
                           ^^^^^^^^^^^^^
  File "/Users/aniket/Projects/gsoc/retriever/env/lib/python3.11/site-packages/retriever/lib/scripts.py", line 115, in SCRIPT_LIST
    return reload_scripts()
           ^^^^^^^^^^^^^^^^
  File "/Users/aniket/Projects/gsoc/retriever/env/lib/python3.11/site-packages/retriever/lib/scripts.py", line 59, in reload_scripts
    if not check_retriever_minimum_version(read_script):
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aniket/Projects/gsoc/retriever/env/lib/python3.11/site-packages/retriever/lib/scripts.py", line 32, in check_retriever_minimum_version
    if not parse_version(VERSION) >= parse_version("{}".format(mod_ver)):
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aniket/Projects/gsoc/retriever/env/lib/python3.11/site-packages/pkg_resources/_vendor/packaging/version.py", line 198, in __init__
    raise InvalidVersion(f"Invalid version: '{version}'")
pkg_resources.extern.packaging.version.InvalidVersion: Invalid version: ''

This happens because a few JSON files under /scripts are missing the "retriever_minimum_version" property.
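One way to find the affected files is a short scan of the scripts directory. This is a minimal sketch, not part of retriever: the function name find_scripts_missing_key is hypothetical, and it assumes each script is a top-level JSON object stored as a .json file directly in the directory passed in.

```python
import json
from pathlib import Path


def find_scripts_missing_key(scripts_dir, key="retriever_minimum_version"):
    """Return sorted names of JSON files in scripts_dir that lack a non-empty `key`."""
    missing = []
    for path in sorted(Path(scripts_dir).glob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # unreadable or malformed files are out of scope here
        # Flag files where the key is absent or set to an empty value.
        if not isinstance(data, dict) or not data.get(key):
            missing.append(path.name)
    return missing
```

Calling find_scripts_missing_key("scripts") from the repository root would list the files to fix. A file whose value is an empty string is reported too, since an empty version fails to parse in the same way.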

dikwickley commented 8 months ago

Just checked: this is an issue with Python 3.9 as well.

(env3.9) aniket@air retriever % retriever ls                            
Traceback (most recent call last):
  File "/Users/aniket/Projects/gsoc/retriever/env3.9/bin/retriever", line 5, in <module>
    from retriever.__main__ import main
  File "/Users/aniket/Projects/gsoc/retriever/env3.9/lib/python3.9/site-packages/retriever/__init__.py", line 7, in <module>
    from retriever.lib.engine_tools import set_proxy, create_home_dir
  File "/Users/aniket/Projects/gsoc/retriever/env3.9/lib/python3.9/site-packages/retriever/lib/__init__.py", line 4, in <module>
    from .datasets import datasets
  File "/Users/aniket/Projects/gsoc/retriever/env3.9/lib/python3.9/site-packages/retriever/lib/datasets.py", line 1, in <module>
    from retriever.lib.scripts import SCRIPT_LIST, get_script, get_dataset_names_upstream
  File "/Users/aniket/Projects/gsoc/retriever/env3.9/lib/python3.9/site-packages/retriever/lib/scripts.py", line 506, in <module>
    global_script_list = StoredScripts()
  File "/Users/aniket/Projects/gsoc/retriever/env3.9/lib/python3.9/site-packages/retriever/lib/scripts.py", line 495, in __init__
    self._shared_scripts = SCRIPT_LIST()
  File "/Users/aniket/Projects/gsoc/retriever/env3.9/lib/python3.9/site-packages/retriever/lib/scripts.py", line 115, in SCRIPT_LIST
    return reload_scripts()
  File "/Users/aniket/Projects/gsoc/retriever/env3.9/lib/python3.9/site-packages/retriever/lib/scripts.py", line 59, in reload_scripts
    if not check_retriever_minimum_version(read_script):
  File "/Users/aniket/Projects/gsoc/retriever/env3.9/lib/python3.9/site-packages/retriever/lib/scripts.py", line 32, in check_retriever_minimum_version
    if not parse_version(VERSION) >= parse_version("{}".format(mod_ver)):
  File "/Users/aniket/Projects/gsoc/retriever/env3.9/lib/python3.9/site-packages/pkg_resources/_vendor/packaging/version.py", line 198, in __init__
    raise InvalidVersion(f"Invalid version: '{version}'")
pkg_resources.extern.packaging.version.InvalidVersion: Invalid version: ''

dikwickley commented 8 months ago

Cause of this issue: in the Script class in retriever/lib/templates.py, retriever_minimum_version is a keyword argument with a default value of "".

https://github.com/weecology/retriever/blob/37982577eca010a03dd5b5e23fe30be8f42da9ed/retriever/lib/templates.py#L30

The version check lives in retriever/lib/scripts.py: https://github.com/weecology/retriever/blob/37982577eca010a03dd5b5e23fe30be8f42da9ed/retriever/lib/scripts.py#L25-L37

The check was supposed to be skipped by the if hasattr(..) condition when a script does not declare a minimum version. Because of the default argument, the attribute always exists, so the check always runs and then fails while parsing the version (which is "").
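The behavior can be reproduced in isolation. The sketch below is not the retriever code: Script is a simplified stand-in for the template class, parse_version_strict is a hypothetical stand-in for a strict version parser (it only handles dotted integers), and is_compatible shows one possible guard that treats an empty value as "no minimum declared".

```python
class Script:
    """Simplified stand-in for the Script template class (hypothetical)."""

    def __init__(self, name, retriever_minimum_version=""):
        self.name = name
        # The attribute is always set, even when the JSON file omits it,
        # so hasattr(script, "retriever_minimum_version") is always True.
        self.retriever_minimum_version = retriever_minimum_version


def parse_version_strict(text):
    """Stand-in for a strict parser: rejects empty or non-numeric input."""
    parts = text.split(".")
    if not text or not all(part.isdigit() for part in parts):
        raise ValueError(f"Invalid version: {text!r}")
    return tuple(int(part) for part in parts)


def is_compatible(script, current_version):
    """Return True when the script can run on current_version."""
    # getattr with a default covers a missing attribute; the truthiness
    # check covers the empty-string default.
    minimum = getattr(script, "retriever_minimum_version", "")
    if not minimum:
        return True  # no minimum declared, so there is nothing to compare
    return parse_version_strict(current_version) >= parse_version_strict(minimum)


legacy = Script("no-minimum-declared")
print(hasattr(legacy, "retriever_minimum_version"))  # True: hasattr cannot act as the guard
print(is_compatible(legacy, "3.1.0"))  # True: the empty default is skipped, not parsed
```

Checking the value rather than the attribute's existence keeps the default argument intact while avoiding the parse of an empty string.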

dikwickley commented 8 months ago

This is suddenly breaking because of the setuptools version. The version of setuptools is not pinned in requirements.txt, and I was able to get it working by downgrading setuptools to 65 (found that from this comment here). The behavior change is in setuptools itself.

This still needs to be fixed. We can either:

  1. Pin the version of setuptools (simplest)
  2. Fix the underlying logic that I mentioned in the above comment.

@henrykironde what do you think?

henrykironde commented 8 months ago

@dikwickley I couldn't reproduce this. Feel free to put in a PR; it will probably give me more insight into the issue.

dikwickley commented 8 months ago

@henrykironde the issue arises from the setuptools version. Make sure your setuptools version is 66 or above and that this version is the one actually being used (not your global one, which is probably 65 or lower).
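To confirm which setuptools an environment actually resolves, one option is to query importlib.metadata from inside that environment. This is a rough sketch: the helper names are hypothetical, and the threshold of major version 66 is an assumption based on this thread (downgrading to 65 fixed it) and on setuptools 66 dropping support for version strings that do not conform to PEP 440.

```python
from importlib import metadata


def major_version(version_text):
    """Return the leading integer of a version string such as '68.0.0'."""
    return int(version_text.split(".")[0])


def setuptools_is_strict(version_text):
    """True when this release is assumed to reject invalid version strings (66+)."""
    return major_version(version_text) >= 66


try:
    # Reports the setuptools visible to the interpreter running this script.
    installed = metadata.version("setuptools")
    print(installed, "strict version parsing:", setuptools_is_strict(installed))
except metadata.PackageNotFoundError:
    print("setuptools is not installed in this environment")
```

Running this with the virtual environment's interpreter, rather than the system one, shows whether the environment is expected to reproduce the error.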

ethanwhite commented 8 months ago

Just confirming that this is also an issue with Python 3.11 + setuptools 68.0.0 and we're broken in Python 3.12 for other reasons (#1673).

@henrykironde getting this fixed should be a priority