JuliaClimate / STAC.jl

SpatioTemporal Asset Catalog (STAC) julia client
MIT License

Not working with the GEOBON STAC catalogue #17

Closed tpoisot closed 1 month ago

tpoisot commented 3 months ago

The following returns a collection with no items:

using STAC
catalog = STAC.Catalog("https://stac.geobon.org/")
entry = catalog["chelsa-clim"]

Specifically, the output is

julia> entry = catalog["chelsa-clim"]
chelsa-clim
CHELSA Climatologies
CHELSA Climatologies
License: CC-BY-SA-4.0

But opening the URL directly gives a list of the expected items within this collection: https://stac.geobon.org/collections/chelsa-clim/items

Is there something I am missing?

Alexander-Barth commented 3 months ago

Thanks for the reproducer. The code looks for links whose rel is item (without the trailing s):

https://github.com/JuliaClimate/STAC.jl/blob/e163edb7728c8e62c20c461fe775ab25f6302e63/src/catalog.jl#L118

In stac.geobon.org's JSON, however, the rel field has the value items (with an s):

julia> entry = catalog["chelsa-clim"];

julia> entry.data.links
4-element JSON3.Array{JSON3.Object, Base.CodeUnits{UInt8, String}, SubArray{UInt64, 1, Vector{UInt64}, Tuple{UnitRange{Int64}}, true}}:
 {
    "rel": "items",
   "type": "application/geo+json",
   "href": "https://stac.geobon.org/collections/chelsa-clim/items"
}
 {
    "rel": "parent",
   "type": "application/json",
   "href": "https://stac.geobon.org/"
}
 {
    "rel": "root",
   "type": "application/json",
   "href": "https://stac.geobon.org/"
}
 {
    "rel": "self",
   "type": "application/json",
   "href": "https://stac.geobon.org/collections/chelsa-clim"
}

See the spec: https://github.com/radiantearth/stac-spec/blob/2e6947df194a707354d6d5be1da6762647ee593f/commons/links.md#link-object

and example:

https://github.com/radiantearth/stac-spec/blob/2e6947df194a707354d6d5be1da6762647ee593f/examples/collection.json#L93

Does the spec also mention items (plural) for the rel field?
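
To make the mismatch concrete, here is a minimal Python sketch (not STAC.jl's actual code) that takes the four links printed above and looks for item links under either spelling. The helper name find_item_links and the idea of accepting both item and items are assumptions made for illustration only.

```python
import json

# Links copied from the chelsa-clim collection output shown above.
COLLECTION_LINKS = json.loads("""
[
  {"rel": "items", "type": "application/geo+json",
   "href": "https://stac.geobon.org/collections/chelsa-clim/items"},
  {"rel": "parent", "type": "application/json",
   "href": "https://stac.geobon.org/"},
  {"rel": "root", "type": "application/json",
   "href": "https://stac.geobon.org/"},
  {"rel": "self", "type": "application/json",
   "href": "https://stac.geobon.org/collections/chelsa-clim"}
]
""")


def find_item_links(links, rels=("item",)):
    """Return the href of every link whose rel is in `rels` (hypothetical helper)."""
    return [link["href"] for link in links if link.get("rel") in rels]


# Matching only "item" finds nothing in this collection.
strict = find_item_links(COLLECTION_LINKS)

# Accepting "items" as well finds the items endpoint.
lenient = find_item_links(COLLECTION_LINKS, rels=("item", "items"))

print(strict)
print(lenient)
```

Note that an items link points to an endpoint that lists items rather than to a single item, so a client would still have to fetch that URL and unpack the response.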

tpoisot commented 3 months ago

Thanks @Alexander-Barth -- you are right, neither the spec nor the IANA link relation types use items.

@jmlord / @glaroc -- it seems that the GEOBON STAC catalogue isn't following the spec

glaroc commented 3 months ago

We are using STAC FastAPI. This might be a difference between the STAC spec and the STAC API spec? Other catalogs I'm looking at use items with an "s".

https://planetarycomputer.microsoft.com/api/stac/v1/collections/terraclimate
https://eocat.esa.int/eo-catalogue/collections

tpoisot commented 3 months ago

Yeah, the STAC API spec is different from the STAC spec 😕

Alexander-Barth commented 3 months ago

That would be a real bummer. The STAC spec is already quite extensive as is. Or maybe it is an extension (https://stac-extensions.github.io/) or a beta version of the spec (just speculation). Maybe raising an issue at https://github.com/radiantearth/stac-spec/ would help?

glaroc commented 3 months ago

In any case, if catalogs like the Planetary Computer and ESA have items instead of item, this library should probably support it. Otherwise, many users will face similar issues. The rstac R package and the Python pystac library work fine with our catalog.

Alexander-Barth commented 3 months ago

Are you able to read the items with pystac? This is what I tried, but I also got an empty list for the items.

import pystac
root_catalog = pystac.Catalog.from_file("https://stac.geobon.org/")

root_catalog.get_child("chelsa-clim").title
# output 'CHELSA Climatologies'

list(root_catalog.get_child("chelsa-clim").get_items())
# output []

list(root_catalog.get_child("chelsa-clim").get_items(recursive=True))
# output []

pystac.__version__
# output '1.10.1'

I am not very familiar with pystac; I followed the documentation here: https://pystac.readthedocs.io/en/stable/quickstart.html#Crawling-Items

glaroc commented 3 months ago

@Alexander-Barth I believe you have to use pystac_client: https://pystac-client.readthedocs.io/en/latest/usage.html

Static and dynamic catalogs have different characteristics. For example, sub-catalogs are not allowed with the STAC API Spec, while they are in the STAC Spec. I guess items vs item is another difference.
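
As a rough illustration of what a dynamic catalog implies for a client: an items endpoint returns a GeoJSON FeatureCollection, which a server may split across pages connected by links with rel next. The sketch below walks such pages. The fetch callable, the in-memory PAGES data, and the ?page=2 URL are all made up for illustration and are not taken from the GEOBON server's actual responses.

```python
def iter_items(start_url, fetch):
    """Yield every feature from a paged items endpoint.

    `fetch` is any callable mapping a URL to a decoded JSON dict
    (hypothetical; in real use it would perform an HTTP GET).
    """
    url = start_url
    seen = set()
    while url is not None and url not in seen:
        seen.add(url)  # guard against a server that links back to a visited page
        page = fetch(url)
        yield from page.get("features", [])
        # Follow the first link with rel == "next", if there is one.
        url = next(
            (link["href"] for link in page.get("links", [])
             if link.get("rel") == "next"),
            None,
        )


# Made-up two-page response standing in for a real server.
BASE = "https://stac.geobon.org/collections/chelsa-clim/items"
PAGES = {
    BASE: {
        "type": "FeatureCollection",
        "features": [{"id": "bio9"}, {"id": "bio8"}],
        "links": [{"rel": "next", "href": BASE + "?page=2"}],
    },
    BASE + "?page=2": {
        "type": "FeatureCollection",
        "features": [{"id": "bio1"}],
        "links": [],
    },
}

item_ids = [feature["id"] for feature in iter_items(BASE, PAGES.__getitem__)]
print(item_ids)
```

Passing fetch as a parameter keeps the paging logic separate from the HTTP layer, so it can be exercised without network access.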

Alexander-Barth commented 3 months ago

This seems to be relevant:

https://github.com/radiantearth/stac-api-spec/blob/604ade6158de15b8ab068320ca41e25e2bf0e116/overview.md?plain=1#L40

Alexander-Barth commented 1 month ago

This should be fixed in the current main branch:

julia> using STAC
Precompiling STAC
  1 dependency successfully precompiled in 3 seconds. 41 already precompiled.

julia> catalog = STAC.Catalog("https://stac.geobon.org/")
geobon-stac
BON in a Box STAC
Spatio Temporal Asset Catalog for layers used in BON in a Box, courtesy of GEO BON.
Children:
   * chelsa-clim: CHELSA Climatologies
   * gfw-lossyear: Global Forest Watch - Loss year
   * soilgrids: Soil Grids datasets
   * ghmts: Global Human Modification of Terrestrial Systems
   * gfw-treecover2000: Global Forest Watch - Tree cover 2000
   * gfw-gain: Global Forest Watch - Gain
   * chelsa-monthly: CHELSA monthly timeseries
   * esacci-lc: ESA Land cover time series
   * chelsa-clim-proj: CHELSA Climatologies Projections
   * silvis: Silvis Dynamic Habitat Indices
   * colombia-lc: Colombian land cover time series
   * earthenv_topography: EarthEnv - Topography
   * earthenv_landcover: EarthEnv - Consensus Land Cover - Full version
   * earthenv_habitat_heterogeneity: EarthEnv - Habitat heterogeneity
   * accessibility_to_cities: Accessibility - Time to access cities
   * colombia_forests: Time series of the presence of forests in Colombia
   * qc_pilot_env: Environmental layers for the Quebec pilot BON optimization project.
   * global-mammals: Global habitat availability for mammals from 2015-2055
   * fragmentation-rmf: Relative Magnitude of Fragmentation (RMF)
   * colombia-human-footprint: Colombia - Human Footprint
   * colombia-protected-areas: Colombia - Protected Areas
   * earthenv_topography_derived: EarthEnv - Derived topographic categorical variables
   * distance_to_roads: Distance to roads
   * cec_land_cover: CEC North American Land Cover
   * gbif_heatmaps: Occurrence density maps created from GBIF data
   * cec_land_cover_percentage: CEC North American Land Cover Percentage at 300 m resolution
   * cec_derived_maps: Rasters derived from CEC North American Land Cover map to serve as inputs for various pipelines.
   * ncp_cna: Nature's Contribution to People - Critical Natural Assets

julia> entry = catalog["chelsa-clim"]
chelsa-clim
CHELSA Climatologies
CHELSA Climatologies
License: CC-BY-SA-4.0
Items:
   * bio9
   * bio8
   * bio7
   * bio6
   * bio5
   * bio4
   * bio3
   * bio2
   * bio19
   * bio18
   * bio17
   * bio16
   * bio15
   * bio14
   * bio13
   * bio12
   * bio11
   * bio10
   * bio1

julia> entry.items["bio1"]
bio1
bounding box:
    ┌────── 83.99986───────┐
    │                      │
-180.00014              179.99986
    │                      │
    └──────-90.00014───────┘

date time: 1981-01-01T00:00:00
Assets:
   * bio1

julia> entry.items["bio1"].assets["bio1"]
type: image/tiff; application=geotiff; profile=cloud-optimized
href: https://object-arbutus.cloud.computecanada.ca/bq-io/io/CHELSA/climatologies/CHELSA_bio1_1981-2010_V.2.1.tif

STAC.jl should now also support the item-search extension.

tpoisot commented 1 month ago

AMAZING! Thank you so much!!!!

Alexander-Barth commented 1 month ago

Great! I am closing the issue and will soon make a new release.