ssec / sift

A visualization application for satellite imagery
http://sift.ssec.wisc.edu/
GNU General Public License v3.0
46 stars 13 forks source link

Fix setting of external Satpy component configuration #381

Closed nedelceo closed 10 months ago

nedelceo commented 11 months ago

Setting external Satpy component configuration is done by setting satpy_extra_readers_import_path which is missleading as described in #380

This PR:

  1. renames setting satpy_extra_readers_import_path to satpy_extra_config_path
  2. allows setting of satpy config path via env variable SATPY_CONFIG_PATH
nedelceo commented 11 months ago

I am not sure how to test changes. Should I create similar to test_satpy_import.py

nedelceo commented 11 months ago

pre-commit.ci autofix

djhoese commented 11 months ago

About tests...it may be hard to test this since everything happens at import time. If we moved the main part of the logic into a function you could probably run the function in your test and make sure it changes the satpy config with different sift config values.

nedelceo commented 11 months ago

pre-commit.ci autofix

djhoese commented 11 months ago

Looks like there is one relevant test failure.

 uwsift/tests/test_satpy_import.py:115: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

xdg_config_home = '/home/runner/.cache/SIFT/workspace/temp/pytest-of-runner/pytest-0/test_config_satpy_import_path0/config'
base_config_dir = '/home/runner/.cache/SIFT/workspace/temp/pytest-of-runner/pytest-0/test_config_satpy_import_path0/config/SIFT/config'
overwritten_satpy_import_path = '/home/runner/.cache/SIFT/workspace/temp/pytest-of-runner/pytest-0/test_config_satpy_import_path0/satpy'
satpy_init_path = '/home/runner/.cache/SIFT/workspace/temp/pytest-of-runner/pytest-0/test_config_satpy_import_path0/satpy/__init__.py'

    def check_uwsift_paths(
        xdg_config_home: Union[str, None],
        base_config_dir: str,
        overwritten_satpy_import_path: Union[str, None],
        satpy_init_path: str,
    ):
        if xdg_config_home is None:
            modified_env = {}
        else:
            modified_env = {"XDG_CONFIG_HOME": xdg_config_home}

        # Use stderr, because other print statements will confuse the `literal_eval`.
        # If an import is overwritten, then `overwrite_import` will output to stdout.
        command = [
            get_python_path(),
            "-c",
            "import sys, uwsift, satpy, satpy.readers\n"
            "print({'base_config_dir': uwsift.USER_CONFIG_PATHS[0], "
            "'overwritten_satpy_import_path': uwsift.config.get('satpy_import_path', None), "
            "'satpy_init_path': satpy.__file__})",
        ]
        working_dir = os.path.normpath(os.path.join(os.path.dirname(__file__), "..", ".."))
        process = subprocess.run(
            command, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL, env=modified_env, cwd=working_dir, check=True
        )
        results = literal_eval(process.stdout.decode("utf-8"))

        # Can we configure the config file path with the environment variable `XDG_CONFIG_HOME`?
        assert results["base_config_dir"] == base_config_dir
        # Is the Satpy module path correctly read from the config file `general_settings.yml`?
        assert results["overwritten_satpy_import_path"] == overwritten_satpy_import_path
        # Where is the `__init__.py` from Satpy located?
>       assert results["satpy_init_path"] == satpy_init_path
E       AssertionError: assert '/usr/share/m...y/__init__.py' == '/home/runner...y/__init__.py'
E         - /home/runner/.cache/SIFT/workspace/temp/pytest-of-runner/pytest-0/test_config_satpy_import_path0/satpy/__init__.py
E         + /usr/share/miniconda3/envs/test-environment/lib/python3.8/site-packages/satpy/__init__.py

Any idea if this new output is expected?

nedelceo commented 11 months ago

Well I am not able to figure out which change in __init__.py is responsible for the AssertationError. Maybe adding import satpy?

djhoese commented 11 months ago

Due to the way the test is setup it is hard to tell which exact check if failing. The test will change some stuff, run a check, change some stuff, run a check, and so on, but we don't know which set of changed stuff is triggering it since the traceback doesn't seem to say that. Let me try downloading your branch and see if I understand the issue.

djhoese commented 11 months ago

Ok I made a commit (but github isn't currently showing it for some reason, we'll wait and see if it shows up today) that seems to fix the tests but I'm not happy about it. You/we were correct that the import satpy is messing things up, but that is actually what I wanted to get revealed by these changes. That is, changing sys.path to allow the user to customize import paths at import time is extremely fragile and I'd like to avoid this in the future if we can.

I have a meeting now, but I'll try to look at this more today.

djhoese commented 11 months ago

@ameraner What is the main use case for satpy_import_path? I found the documentation here:

https://sift.readthedocs.io/en/latest/configuration/external_satpy.html#replacing-satpy-by-external-installation

But is this mostly for developers that are running development Satpy with a pre-built SIFT application/bundle?

ameraner commented 11 months ago

But is this mostly for developers that are running development Satpy with a pre-built SIFT application/bundle?

Yes, exactly. This was before my time, but I don't think we requested it - it was implemented as a quick way to try out a dev-Satpy inside SIFT.

If it's causing problems, I think we could remove this - a developer/expert user most likely runs SIFT via a conda environment, where Satpy can be installed e.g. in pip-editable mode easily...

djhoese commented 11 months ago

I think I agree. Sure, yes, there are times when you have a SIFT tarball and you want to test a few hacks to see how it fixes things...but these are all things that can:

  1. Be done with modifying the internal .py files of a conda-pack'd environment now.
  2. Should require a unit test for any application type thing and should be done outside the tarball anyway.

Ok. I'll see how hard it is to remove the logic and the documentation.

djhoese commented 11 months ago

I know @arcanerr isn't working on the project, but as a developer maybe you have some memory of why the user is supposed to be able to customize the import path of the external Satpy package? Is there a use case other than developer's testing things in an already built release of the application?

djhoese commented 11 months ago

Ok I removed it and tried to do it all in one commit. The test was updated to basically now test only XDG_CONFIG_HOME which is a decently useful test in my opinion. It is still a lot of test code for a simple function though so I'd be open to cleaning it up even further if someone wanted.

djhoese commented 11 months ago

Any last issues with this before I merge it (later today)?

ameraner commented 11 months ago

LGTM, thanks for your contributions!

arcanerr commented 11 months ago

@ameraner What is the main use case for satpy_import_path? I found the documentation here:

https://sift.readthedocs.io/en/latest/configuration/external_satpy.html#replacing-satpy-by-external-installation

But is this mostly for developers that are running development Satpy with a pre-built SIFT application/bundle?

Sorry for not contributing to this discussion earlier, but having the option to use a different Satpy version than coming with a bundled SIFT software package was an explicit requirement by EUMETSAT.

I think, the use case was not only for SIFT developers but probably more for external users of SIFT who want to use a newer (or an older) Satpy version which might be better suited for some data to be worked with.

djhoese commented 11 months ago

@sjoro Any memory of this requirement mentioned above?

@arcanerr @ameraner In my opinion this shouldn't be allowed. There are too many chances for a version of Satpy that isn't tested with SIFT to break something and not work. I guess the idea is that SIFT will have a slow/delayed release cycle so users should be able to use new readers that have been developed in Satpy since the last release...I don't know. It is too hacky in my opinion.

arcanerr commented 11 months ago

I don't see why it shouldn't be allowed to exchange Satpy by configuration even if sometimes things break with a different version if this is clearly communicated (including: "loss of warranty", so to speak). It enables users to do things that they otherwise cannot do. Just my 2 cents.

On the other hand, the feature creates maintenance costs, so if it is considered as not required any more, removing it makes sense. And if it makes SIFT unstable even if the bundled Satpy is used, I agree, that must not be the case.

djhoese commented 11 months ago

Yeah, that's what my "...I don't know" was meant to convey. I like the blanket/generic feature request of "I should be able to use a new version of Satpy with my SIFT installation if I choose to". But the way it is/was implemented was only lucky that it didn't break. Mainly, that it depended on Satpy not being imported before the configuration's import path was read and applied/used. This lead to me trying to importlib.reload the Satpy modules and once I did that things just got weird. Doing something like modifying sys.modules and using reload are not things that will be fun to maintain in the future (especially when it is happening at import time).

Someone using a conda-based installation or development installation of SIFT can still upgrade their version of Satpy freely. The bundled SIFT distribution on the other hand is a different style of environment in my opinion. It is a bundled/guaranteed frozen set of SIFT and its dependencies and I don't think that should be manipulated or be allowed to reach outside of itself (import Python modules outside the bundle). That said, I could see a long term path in the future where SIFT's bundle allows updating of SIFT from within the same distribution. Kind of like any GUI application that has an "Update" button and just asks you to restart to use the new version once it is downloaded. This functionality can and probably will involve updating dependencies.

I'm curious who the original target audience was for the feature as that may skew my feelings about it. If it was me or Andrea or Ray (and since we switched to conda-pack bundles) and we just wanted it for testing, I'd say we hack in a python -m pip install -e /path/to/my/local/satpy...assuming pip is available inside conda-pack'd environments. It's a tarball so if we want the original version of Satpy back then we delete the extracted directory and re-extract it. If this feature is for people who don't have much programming experience...then I'm tempted to say they shouldn't be doing this at all. This type of feature may also be a sign that:

  1. The release of development bundles ("nightly" builds) was not frequent enough. Since these didn't exist during the EUMETSAT development cycles, this makes sense. This is something every commit to master does from GitHub Actions.
  2. The creation of development or conda-based installs was too difficult so people needing to do this weren't comfortable and wanted something that seemed simpler to them. This may be slightly easier now that it used to be, but is still generally an issue.
sjoro commented 11 months ago

hi all, a bit of background for this.

basically the EUM requirement for SIFT is/was "I should be able to use a new/another version of Satpy with my SIFT installation if I choose to". the implementation of this is left to the developers and in general, this is not something that should concern any "regular" users.

the reason behind this requirement is that we use SIFT for satellite/instrument commissioning, which is usually rather fast paced period where all the product validation and checks need to be completed. this is a 100% EUM activity during which we can't ecpect and rely on external parties, in this case Pytroll/Satpy/SIFT developers, to necessarily react, package and publish new updates to Satpy/SIFT in the timeframe required for commissioning. we have to have a way to bypass official Satpy repo with another version where we can control the environment. later on, once time allows, any beneficial changes to Satpy would be, of course, contributed back.

in short, this is a required risk management feature. with that, i also need to acknowledge that during commissioning i would only really expect problems with Satpy-readers as some of them are developed using test data. using external readers is already well supported by Satpy so using our own modified readers is easy and straightforward.

djhoese commented 11 months ago

@sjoro Who are the intended users inside EUMETSAT? How do they get the new/modified versions of Satpy? Would their installations be on shared systems where someone more technical (assuming the users are scientists/engineers) is doing the actual installation?

I ask because I'm wondering if a conda-based environment installation with a development/source install of Satpy (or install from EUMETSAT's internal copy of Satpy) would work as an alternative? This gives the flexibility that any other libraries that need updating (bug fixes in pyresample, etc) can also be updated quite easily. This would also keep the bundled/tarball installation as a "frozen" installation.

If this wouldn't work than I can re-add the import path functionality, but I'll be more careful about where and how it shows up to avoid future issues.

sjoro commented 11 months ago

@djhoese sorry for not replying, i was on holidays...

the intended users are scientists and/or engineers and we aim for having a central installation so that they do not need to worry about installation details. i would say that a conda-based environment with an option to do development/source install of Satpy is fulfilling the need perfectly. this gives us also an option to have separate environments for different users, if a specific version/branch os Satpy is needed for any particular reason and if need to be.

djhoese commented 11 months ago

is fulfilling the need perfectly

Your first sentence did not make me think this was going to be in your second sentence. I'm glad that that will be good enough. I think with the scripts we have and maybe with more advanced (possibly docker image-based) building processes it wouldn't be hard for someone on the EUMETSAT side to produce the distributed tarball with a development version of Satpy rather quickly if that ends up being easiest for the users you had in mine.

Ok so I think we've decided that the removal of the satpy import path configuration option is OK and we'll have other ways of dealing with it in the future. If it needs to be added back in the future we can talk about that at that time.

djhoese commented 10 months ago

Thanks @nedelceo! Merging now.