sat-utils / sat-search

A python client for sat-api
MIT License

Constrained search from Pangeo example returns too many results. #127

Status: Closed (metasim closed this issue 2 years ago)

metasim commented 2 years ago

This is from the Pangeo Landsat 8 Tutorial (which I know is not your product), but it's a basic query, and it seems that regardless of the input parameters, I get this warning:

There are more items found (34678) than the limit (10000) provided.

I further constrained the search by providing the scene ID that's used further down, and I got the same number of results. There are also S2 scenes mixed in with L8, so something bigger seems to be amiss.

Any ideas?

Relevant Code

bbox = (-124.71, 45.47, -116.78, 48.93) #(west, south, east, north) 

timeRange = '2019-01-01/2020-10-01'

# STAC metadata properties
properties =  ['eo:row=027',
               'eo:column=047',
               'landsat:tier=T1'] 

results = Search.search(url = 'https://earth-search.aws.element84.com/v0',
                        collection='landsat-8-l1', 
                        id='LC80470272019096', # <-- I added this to constrain the search further
                        bbox=bbox,
                        datetime=timeRange,
                        property=properties,
                        sort=['<datetime'],
                        )

matthewhanson commented 2 years ago

Hello @metasim, it looks like there are a couple of issues here. First, you'll want to switch from sat-search to pystac-client (https://github.com/stac-utils/pystac-client), which is a replacement with almost the same syntax. The Pangeo tutorial predates pystac-client.

The collection and id keywords should actually be collections and ids, and each takes a list of values rather than a single string.
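To make the keyword point concrete, here is a minimal sketch of a hypothetical helper, `pluralize_search_kwargs`, that renames the singular `collection` and `id` keywords to their plural, list-valued forms before they reach a search call. The helper is an illustration only and is not part of sat-search or pystac-client.

```python
def pluralize_search_kwargs(**kwargs):
    """Return a copy of kwargs with singular search keywords made plural.

    Hypothetical helper for illustration; not part of any library.
    """
    renames = {"collection": "collections", "id": "ids"}
    fixed = {}
    for key, value in kwargs.items():
        new_key = renames.get(key, key)
        # The plural keywords expect a list, so wrap a lone string.
        if new_key in ("collections", "ids") and isinstance(value, str):
            value = [value]
        fixed[new_key] = value
    return fixed


params = pluralize_search_kwargs(
    collection="landsat-8-l1",
    id="LC80470272019096",
    bbox=(-124.71, 45.47, -116.78, 48.93),
)
print(params)
```

Keywords that are already plural, and unrelated ones such as `bbox`, pass through unchanged.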

Also, the Landsat data in Earth-search is for a deprecated open dataset (Landsat Collection 1), but luckily USGS has a new official Landsat STAC API up at https://landsatlook.usgs.gov/stac-server

So putting this together to query Landsat Level 2 surface reflectance data would look like:

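from pystac_client import Client

bbox = (-124.71, 45.47, -116.78, 48.93)  # (west, south, east, north)

timeRange = '2019-01-01/2020-10-01'

# STAC metadata properties, carried over from the original query.
# The property names may need adjusting to the ones the USGS API exposes.
properties = ['eo:row=027',
              'eo:column=047',
              'landsat:tier=T1']

client = Client.open('https://landsatlook.usgs.gov/stac-server')

search = client.search(
    collections=['landsat-c2l2-sr'],
    bbox=bbox,
    datetime=timeRange,
    query=properties,                  # pystac-client uses `query`, not `property`
    sortby=['+properties.datetime'],   # ascending by acquisition time
)

print(f"{search.matched()} items found")

# Newer pystac-client releases provide item_collection() for this step
items = search.get_all_items()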
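The `name=value` filter strings can also be written as a dictionary in the form used by the STAC API Query extension, where each property maps to an operator such as `eq`. Below is a minimal sketch that assumes equality filters only. `to_query_dict` is a hypothetical helper, and the property names are copied from the snippet above, so they may not match what a given API exposes.

```python
def to_query_dict(filters):
    """Convert 'name=value' strings into a STAC Query extension dictionary.

    Hypothetical helper for illustration; handles equality filters only.
    """
    query = {}
    for item in filters:
        name, sep, value = item.partition("=")
        if not sep or not name.strip():
            raise ValueError(f"expected 'name=value', got {item!r}")
        # Values stay strings so zero-padded codes such as '027' are preserved.
        query[name.strip()] = {"eq": value.strip()}
    return query


properties = ["eo:row=027", "eo:column=047", "landsat:tier=T1"]
query = to_query_dict(properties)
print(query)
```

A dictionary of this shape can be passed as the `query` argument to `client.search` in place of the list of strings.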

Finally, you may want to take a look at the odc-landsat notebook in this repo: https://github.com/Element84/geo-notebooks. It shows a newer way of getting data by using OpenDataCube with STAC.

metasim commented 2 years ago

@matthewhanson Thanks for the quick feedback! That's all good to know. TBH, I was only trying out the Pangeo notebook to see what the assemblage of Python libraries could do. Glad to know about your own notebooks and that ODC works with STAC.