microsoft / PlanetaryComputerExamples

Examples of using the Planetary Computer
MIT License
Querying the STAC API for several models at the same time #222

Closed ivanacvijanovic closed 1 year ago

ivanacvijanovic commented 1 year ago

Hi,

I am new to the Planetary Computer and am looking for advice on how to load several climate models at the same time. I am using the nasa-nex-gddp-cmip6 data, and I found this example for one model:

    search = catalog.search(
        collections=["nasa-nex-gddp-cmip6"],
        datetime="1950/2000",
        query={"cmip6:model": {"eq": "ACCESS-CM2"}},
    )
    items = search.get_all_items()
    len(items)

However, I need more than just "ACCESS-CM2". For example, I also need "ACCESS-ESM1-5", "BCC-CSM2-MR", "CESM2", and "CESM2-WACCM". Could you please tell me the syntax for loading all, or at least several, climate models at the same time?

Many thanks,

Ivana

TomAugspurger commented 1 year ago

You would want

    query={
        "cmip6:model": {"in": ["ACCESS-ESM1", "ACCESS-CM2"]},
    },

This uses the STAC API query extension (one of the API specification's fragments), where you can find the full list of operators / values.

In [3]: search = catalog.search(
   ...:     collections=["nasa-nex-gddp-cmip6"],
   ...:     datetime="1950/2000",
   ...:     query={"cmip6:model": {"in": ["ACCESS-ESM1", "ACCESS-CM2"]}},
   ...: )
   ...:
   ...: len(search.item_collection())
Out[3]: 51
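
For a longer list of models, the `query` dictionary can be built programmatically instead of typed by hand. The following is a minimal sketch: `build_in_query` is a hypothetical helper written for this illustration, not part of pystac-client, and it only assembles the same `{"in": [...]}` structure shown above.

```python
from typing import Any, Dict, Iterable, Mapping


def build_in_query(allowed: Mapping[str, Iterable[str]]) -> Dict[str, Any]:
    """Map each item property to an "in" filter over its allowed values.

    Hypothetical helper for illustration; it does no validation of the
    property names or values against the collection.
    """
    return {prop: {"in": list(values)} for prop, values in allowed.items()}


# Model names taken from the question above.
models = ["ACCESS-CM2", "ACCESS-ESM1-5", "BCC-CSM2-MR", "CESM2", "CESM2-WACCM"]
query = build_in_query({"cmip6:model": models})

# The result is passed to the search exactly as in the session above:
# search = catalog.search(
#     collections=["nasa-nex-gddp-cmip6"],
#     datetime="1950/2000",
#     query=query,
# )
```

Only item properties can be filtered this way, so the helper is useful for keys such as `cmip6:model` but not for asset names.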

ivanacvijanovic commented 1 year ago

Thank you very much. By the same logic, would it be possible to query by both model and variable at the same time? The following does not seem to work:

    query={
        "cmip6:model": {"in": ["ACCESS-CM2", "ACCESS-ESM1-5", "CESM2"]},
        "cmip6:variable": {"in": ["tasmax", "tas"]},
    }

TomAugspurger commented 1 year ago

What would querying by variable mean here, and what's the motivation for querying on it? Something like "only return the times with a tasmax variable"?

That isn't possible with the STAC items as they are today, since you can only query on item properties and things like tasmax are the keys of the assets. We could add some kind of variables field like ["tasmax", "tas", ...] with the list of variables. I'm not sure whether the STAC API specification supports queries on lists though (like "contains any" or "contains all").

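For now, you can achieve this in Python with something like

    items = catalog.search(...).item_collection()
    # item.assets is a dict, so intersect its keys (not the dict itself) with the wanted names
    keep = [item for item in items if item.assets.keys() & {"tasmax", "tasmin"}]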
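
The client-side filter above can be tried without a network connection. The sketch below uses `SimpleNamespace` objects as stand-ins for STAC items (an assumption made for illustration; real items come from the search) and a hypothetical `filter_by_assets` helper that covers both the "contains any" and "contains all" cases. It relies on the fact that dictionary key views support set operations.

```python
from types import SimpleNamespace


def filter_by_assets(items, wanted, require_all=False):
    """Keep items whose asset keys include any (or all) of the wanted names.

    Hypothetical helper: it only assumes each item has an `assets` dict.
    """
    wanted = set(wanted)
    if require_all:
        # The key view must be a superset of the wanted names.
        return [item for item in items if item.assets.keys() >= wanted]
    # A non-empty intersection is truthy, so one shared name is enough.
    return [item for item in items if item.assets.keys() & wanted]


# Stand-in items; the asset values are irrelevant for this filter.
items = [
    SimpleNamespace(id="a", assets={"tas": None, "tasmax": None, "pr": None}),
    SimpleNamespace(id="b", assets={"tasmax": None}),
    SimpleNamespace(id="c", assets={"pr": None}),
]

any_match = filter_by_assets(items, ["tasmax", "tas"])
all_match = filter_by_assets(items, ["tasmax", "tas"], require_all=True)
print([item.id for item in any_match])  # ['a', 'b']
print([item.id for item in all_match])  # ['a']
```

Because the filtering happens after the search returns, every matching item is still fetched from the API; this only narrows the list locally.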

TomAugspurger commented 1 year ago

This will need to be fixed in the upstream stactools package. I opened https://github.com/TomAugspurger/nasa-nex-gddp-cmip6/issues/1 to track it.