stac-utils / pystac-client

Python client for searching STAC APIs
https://pystac-client.readthedocs.io

Collection search #735

Closed hrodmn closed 2 weeks ago

hrodmn commented 1 month ago


Description: Collection discovery is a challenge in the current environment. A user might know which catalog or API they want data from, but they do not know the actual collection_id that they will need to perform an item-level search. The STAC API Collection Search Extension makes it possible for a user to apply filters to collection-level metadata. This is most useful when the STAC API Free Text Extension is enabled, because a user can search an API with a term like q=DEM to find all collections that have the term DEM in the title, description, or keywords.

Since most APIs do not currently have the collection search extension enabled, I added some client-side filtering logic: the CollectionSearch class requests the full list of collections from the /collections endpoint, then applies a limited set of filters (datetime, bbox, q) to the list.
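To illustrate the client-side fallback described above, here is a minimal sketch of filtering an already-fetched list of collection dictionaries by bbox and q. It is not the code from this PR: the function names (`filter_collections`, `matches_q`, `bboxes_intersect`) are hypothetical, it assumes 2D bounding boxes in `[west, south, east, north]` order, treats q as a plain case-insensitive substring, and leaves out the datetime filter for brevity.

```python
from typing import Any, Dict, List, Optional, Sequence


def bboxes_intersect(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if two 2D bboxes [west, south, east, north] overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def matches_q(collection: Dict[str, Any], q: str) -> bool:
    """Case-insensitive substring match on title, description, and keywords."""
    haystack = " ".join(
        [
            collection.get("title") or "",
            collection.get("description") or "",
            " ".join(collection.get("keywords") or []),
        ]
    ).lower()
    return q.lower() in haystack


def filter_collections(
    collections: List[Dict[str, Any]],
    bbox: Optional[Sequence[float]] = None,
    q: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Keep only the collections that satisfy every filter that was supplied."""
    matched = []
    for collection in collections:
        if q is not None and not matches_q(collection, q):
            continue
        if bbox is not None:
            # A STAC collection lists one or more spatial extents.
            extents = collection.get("extent", {}).get("spatial", {}).get("bbox", [])
            if not any(bboxes_intersect(bbox, extent) for extent in extents):
                continue
        matched.append(collection)
    return matched
```

Calling `filter_collections(collections, q="DEM")` on the parsed `collections` array of a /collections response would keep only entries whose title, description, or keywords mention the term.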


hrodmn commented 1 month ago

I added the Client.collection_search method but now I wonder if it would make more sense to add the optional filter args to Client.collections instead since that would follow the pattern from the STAC API a bit more closely. When the collection search extension is enabled, you perform a collection search by adding query parameters like bbox and q to GET requests on the /collections endpoint.
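As a rough sketch of what such a request looks like on the wire, the helper below builds a GET URL for the /collections endpoint with the filter parameters attached. The function name `collection_search_url` and the `https://example.com/stac` root are hypothetical, and the sketch assumes bbox is sent as a comma-separated list, as it is for item search.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode


def collection_search_url(
    api_root: str,
    bbox: Optional[Sequence[float]] = None,
    datetime: Optional[str] = None,
    q: Optional[str] = None,
) -> str:
    """Build a /collections URL, adding only the filters that were supplied."""
    params = {}
    if bbox is not None:
        # Assumption: bbox is serialized as comma-separated numbers.
        params["bbox"] = ",".join(str(value) for value in bbox)
    if datetime is not None:
        params["datetime"] = datetime
    if q is not None:
        params["q"] = q
    url = f"{api_root.rstrip('/')}/collections"
    return f"{url}?{urlencode(params)}" if params else url


# Hypothetical API root, used only for illustration.
print(collection_search_url("https://example.com/stac", q="sentinel"))
```

With no filters the helper returns the plain /collections URL, which mirrors the idea that collection search is the same endpoint with optional query parameters.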

codecov-commenter commented 1 month ago

Codecov Report

Attention: Patch coverage is 91.21951% with 18 lines in your changes missing coverage. Please review.

Project coverage is 93.68%. Comparing base (21435b0) to head (07e5187). Report is 81 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pystac_client/collection_search.py | 91.60% | 11 Missing :warning: |
| pystac_client/item_search.py | 77.77% | 4 Missing :warning: |
| pystac_client/cli.py | 86.95% | 3 Missing :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #735      +/-   ##
==========================================
+ Coverage   93.43%   93.68%   +0.25%
==========================================
  Files          13       15       +2
  Lines         990     1188     +198
==========================================
+ Hits          925     1113     +188
- Misses         65       75      +10
```


hrodmn commented 3 weeks ago

I think the last big thing to add here is a CLI method for collection search. @gadomski what do you think about adding the search args to the collections method in the CLI? I could also add a collection-search method, but adding filter parameters to the existing collections method would fit naturally with the STAC API experience (e.g. /collections?q=sentinel).

gadomski commented 3 weeks ago

> what do you think about adding the search args to the collections method in the CLI?

Yup, makes sense to me!

hrodmn commented 2 weeks ago

> Taking a look at the CI errors, looks like you'll need to use pytest.warns to catch-and-assert the client-side filtering warnings.

Argh, yeah. I need to start running scripts/test instead of pytest.

Thanks for the review, I'll get those changes in today!

gadomski commented 2 weeks ago

> I need to start running scripts/test instead of pytest.

or develop an allergic reaction to all warnings, like I have (don't recommend, leads to lots of yak shaving) :-)

hrodmn commented 2 weeks ago

@gadomski thanks for your reviews, sorry for not catching those little CI issues and for half-accepting your suggestion on matched!