icesat2py / icepyx

Python tools for obtaining and working with ICESat-2 data
https://icepyx.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Delegate full granule query to earthaccess, keep smart subsetting logic in icepyx #575

Open weiji14 opened 2 months ago

weiji14 commented 2 months ago

To reduce confusion among ICESat-2 users about whether to use icepyx or earthaccess to search for ATLAS products, this issue discusses a path forward toward a proper separation of concerns between the two libraries. After talking with @JessicaS11 and others at the 2024 UW Hackweek, we propose to change the 'search' workflow to:

  1. For 80% of the users who want an easy entry point -> Delegate full granule search and download to earthaccess. Specifically, use earthaccess to return the Granule IDs from the NSIDC DAAC, and connect/open that (ideally cloud-hosted) dataset via fsspec.
  2. For 20% of the users who want to do more advanced search -> Use earthaccess to look for the Granule IDs, and icepyx (as a plugin/extension) handles the smart open_subset logic.
    • a) Transition phase icepyx=v1.x - Have the variable subsetting logic remain in icepyx only, without integration into earthaccess. Users use this interface with a mix of earthaccess and icepyx.
    • b) Breaking phase icepyx=v2.x - Allow users to install icepyx as an extension of earthaccess. Users will interface directly using earthaccess.smart_open, xref https://github.com/nsidc/earthaccess/issues/328.
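
As a rough illustration of step 1, here is a minimal sketch of what the earthaccess-only path might look like. The helper names `build_search_kwargs` and `search_and_open` are hypothetical (they are not part of either library), and the product, bounding box, and dates in the usage note are placeholder values. It assumes Earthdata Login credentials are available when `search_and_open` is actually called.

```python
def build_search_kwargs(short_name, bounding_box, temporal, version=None):
    """Collect keyword arguments for earthaccess.search_data (hypothetical helper).

    bounding_box is (lower-left lon, lower-left lat, upper-right lon, upper-right lat);
    temporal is a (start, end) pair of date strings.
    """
    kwargs = {
        "short_name": short_name,
        "bounding_box": tuple(bounding_box),
        "temporal": tuple(temporal),
    }
    if version is not None:
        kwargs["version"] = version
    return kwargs


def search_and_open(**kwargs):
    """Search for granules and open them as fsspec file-like objects."""
    # Imported lazily so the helper above can be used without earthaccess installed.
    import earthaccess

    earthaccess.login()  # needs Earthdata Login credentials
    granules = earthaccess.search_data(**kwargs)
    # earthaccess.open returns a list of fsspec-backed file-like objects
    return earthaccess.open(granules)
```

Usage would be something like `files = search_and_open(**build_search_kwargs("ATL06", (-55, 68, -48, 71), ("2019-02-20", "2019-02-28")))`, after which each file object could be handed to h5py or xarray.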

Taking a stab this week as part of https://github.com/ICESAT-2HackWeek/icepyx/issues/5


mfisher87 commented 2 months ago

:100:

weiji14 commented 2 months ago

Implementation-wise, I'm gonna try to increase code coverage on query.py and granules.py in #581 first, and then refactor ipx.Query's internal logic to use earthaccess's functions/methods to get the granule list. Intent is to try to retain backwards compatibility as much as possible in icepyx=1.x, and then make any breaking changes in icepyx=2.x.
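
To make the refactor idea a bit more concrete, here is a sketch of how a Query-like object could keep its existing constructor arguments while delegating the granule search. `DelegatingQuery` and its `granule_list` method are hypothetical stand-ins, not icepyx's actual `ipx.Query` internals, and the mapping of `product`/`spatial_extent`/`date_range` onto `earthaccess.search_data` keywords is an assumption that only covers the bounding-box case. The search function is injectable so the delegation can be tested without network access.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


def _earthaccess_search(**kwargs):
    """Default backend: forward the search to earthaccess."""
    import earthaccess  # lazy import, only needed for real searches

    return earthaccess.search_data(**kwargs)


@dataclass
class DelegatingQuery:
    """Hypothetical stand-in for ipx.Query (not the real class)."""

    product: str
    spatial_extent: Sequence[float]  # bounding box: [ll_lon, ll_lat, ur_lon, ur_lat]
    date_range: Sequence[str]  # [start_date, end_date]
    search_fn: Callable[..., list] = _earthaccess_search

    def granule_list(self):
        """Translate the icepyx-style arguments and delegate the search."""
        return self.search_fn(
            short_name=self.product,
            bounding_box=tuple(self.spatial_extent),
            temporal=tuple(self.date_range),
        )
```

Keeping the old argument names on the outside is one way to stay backwards compatible in icepyx=1.x; swapping `search_fn` for a stub also makes it easier to raise coverage before touching the real network calls.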