Closed nikki-t closed 1 month ago
This is a (painful) way of determining available services for a given collection using GraphQL: https://nasa-openscapes.github.io/2021-Cloud-Hackathon/tutorials/07_Harmony_Subsetting.html#discover-service-options-for-a-given-data-set
Harmony also provides a capabilities endpoint to determine available services: https://harmony.earthdata.nasa.gov/docs#available-services
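For reference, a rough sketch of how the capabilities endpoint could be used. The `/capabilities` path and `collectionId` parameter are taken from the Harmony docs linked above; the helper names and the response keys (`services`, `name`) are assumptions that should be checked against a real response. The request itself likely needs an Earthdata Login token, so the sketch only builds the URL and parses an already-decoded response.

```python
from urllib.parse import urlencode

HARMONY_ROOT = "https://harmony.earthdata.nasa.gov"


def capabilities_url(collection_id: str, root: str = HARMONY_ROOT) -> str:
    """Build the capabilities URL for one collection concept ID."""
    return f"{root}/capabilities?{urlencode({'collectionId': collection_id})}"


def service_names(capabilities: dict) -> list[str]:
    """Pull service names out of a decoded capabilities response.

    Assumes a top-level "services" list whose items carry a "name" key;
    items without a name are skipped and a missing list yields [].
    """
    return [s["name"] for s in capabilities.get("services", []) if "name" in s]
```

Keeping the parsing separate from the HTTP call means the same function works however the response is fetched (requests, an authenticated earthaccess session, etc.).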
I was looking through python_cmr for something else and came across a set of functions for "Tool and Variable Service CMR Queries" starting on this line. Not sure if they'll help but figured it's worth a check to see if they've already done some of the work for us...
I wonder if we should be thinking about how this kind of query might be used and by what/whom?
My current thinking is... If a Python tool is returning information about services, then that information should be usable by a tool (the same tool or a different one). I'm thinking of a pipeline...
result = earthaccess.search_datasets(...)
if "harmony" in result.services:
    subset = result.harmony.subsetter().to_file(name_of_file)  # uses spatial and temporal bounds from query for subsetting
else:
    earthaccess.download(result)
I can also see a case for a user querying services from a notebook: for example, you have found the dataset you want and you want to know whether you have to download/access a complete file or if you can use a service.
I think beyond that, a lot of discovery for services and options would be done via user guides and other web-hosted information.
This was part of our hope via a plugin interface (#328). Sort of like Xarray can discover and use whatever backends you have installed in your environment, earthaccess can discover and use whatever services/subsetters are available via your installed libraries, so long as those libraries have set up the required plugin functionality. This takes the onus off earthaccess to actually implement/maintain specific interfaces (except, perhaps, with a system like harmony) but makes it easy for users to access those other tools through earthaccess in a predictable way.
@andypbarrett and @JessicaS11 - I think you both bring up some great points around the use of services so I put together a mini roadmap for implementing service information with an eye towards implementing a plugin interface.
Requirements analysis: How would a user approach searching for a service?
Proposed code design focusing on bullet 1
Nice to have or future work (based on user need)
As a first step to facilitating the use of services in earthaccess, we should modify earthaccess so that it can list the available services for a collection.
Link to Harmony documentation: https://harmony.earthdata.nasa.gov/docs
Link to CMR API documentation on services: https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html#service
This would allow earthaccess to return a list of services for a collection so that we can integrate future work on service usage into the codebase. Related issue: https://github.com/nsidc/earthaccess/issues/328
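As a starting point, a minimal sketch of listing services for a collection using only the CMR search API. It assumes the collection was retrieved in UMM-JSON format and that associated service concept IDs appear under `meta.associations.services`; the function names are placeholders rather than existing earthaccess API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CMR_SEARCH = "https://cmr.earthdata.nasa.gov/search"


def service_ids(collection: dict) -> list[str]:
    """Return the service concept IDs associated with one UMM-JSON collection item.

    Assumes the IDs live under meta -> associations -> services;
    collections with no associated services yield [].
    """
    associations = collection.get("meta", {}).get("associations", {})
    return list(associations.get("services", []))


def service_url(concept_id: str) -> str:
    """Build the CMR search URL for a single service record in UMM-JSON."""
    return f"{CMR_SEARCH}/services.umm_json?{urlencode({'concept_id': concept_id})}"


def fetch_service(concept_id: str) -> dict:
    """Fetch one service record from CMR (network call, no error handling)."""
    with urlopen(service_url(concept_id), timeout=30) as resp:
        return json.load(resp)
```

With these pieces, listing services for a collection would be `[fetch_service(sid) for sid in service_ids(collection)]`, and the result could later feed a plugin or Harmony interface.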