stac-utils / pystac

Python library for working with any SpatioTemporal Asset Catalog (STAC)
https://pystac.readthedocs.io

Confusing extension detection #1391

Open soxofaan opened 3 weeks ago

soxofaan commented 3 weeks ago

Using pystac 1.10.1:

import pystac

stac_obj = pystac.Catalog(
    id="foo",
    description="Foo",
    stac_extensions=[
        "https://stac-extensions.github.io/datacube/v2.2.0/schema.json"
    ]
)

assert stac_obj.ext.has("cube")

stac_obj.ext.cube

The assert passes, but the last line fails with:

AttributeError: 'CatalogExt' object has no attribute 'cube'

Is that intended behavior?

To use catalog.ext.cube properly in a generic way, I guess I have to guard it with an additional hasattr check:

if stac_obj.ext.has("cube") and hasattr(stac_obj.ext, "cube"):
    x = stac_obj.ext.cube ...

I'd hoped that stac_obj.ext.has("cube") alone would be enough as a guard.

(Note: I'm aware that a Catalog is not supposed to have the datacube extension enabled, but I want to make my code robust against slightly "invalid" STAC data too)

gadomski commented 3 weeks ago

I'm aware that a Catalog is not supposed to have the datacube extension enabled, but I want to make my code robust against slightly "invalid" STAC data too

I'd argue that pystac is working as intended. As an implementation of the spec and its extensions, the code provided (IMO) should reflect the valid use-case, with methods and functions to help bring invalid STAC into a valid state.

In this example, it's not clear what type catalog.ext.cube should even be. CollectionDatacubeExtension, ItemDatacubeExtension, AssetDatacubeExtension, and ItemAssetsDatacubeExtension are each unique implementations of DatacubeExtension for specific STAC object types. There's no equivalent CatalogDatacubeExtension to return from catalog.ext.cube.

Each extension implements its own checks to see if it is valid for an object type in the ext classmethod. Instead of your hasattr check, you could use ext:

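from pystac import ExtensionTypeError
from pystac.extensions.datacube import DatacubeExtension

try:
    cube = DatacubeExtension.ext(stac_obj)
except ExtensionTypeError:
    # stac_obj is a type the datacube extension does not support (e.g. a Catalog)
    cube = None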

This still won't get you datacube extension information for a catalog, however. For that, you'll need to manipulate the catalog's attributes directly via extra_fields: dict[str, Any]
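A minimal sketch of the extra_fields route, assuming only that the object exposes stac_extensions and extra_fields the way pystac STAC objects do. The helper name read_cube_dimensions is hypothetical (it is not part of pystac), and a SimpleNamespace stands in for a Catalog so the snippet runs without pystac installed. cube:dimensions is the field name defined by the datacube extension.

```python
from types import SimpleNamespace
from typing import Any, Optional

DATACUBE_PREFIX = "https://stac-extensions.github.io/datacube/"


def read_cube_dimensions(stac_obj: Any) -> Optional[dict]:
    """Hypothetical helper: return the raw "cube:dimensions" dict, or None."""
    # Only look at the raw fields if the object declares the extension.
    declared = stac_obj.stac_extensions or []
    if not any(uri.startswith(DATACUBE_PREFIX) for uri in declared):
        return None
    # No extension class is involved here: this is plain dict access.
    return stac_obj.extra_fields.get("cube:dimensions")


# Stand-in for a Catalog that carries (technically invalid) datacube fields.
catalog_like = SimpleNamespace(
    stac_extensions=[
        "https://stac-extensions.github.io/datacube/v2.2.0/schema.json"
    ],
    extra_fields={"cube:dimensions": {"t": {"type": "temporal"}}},
)

dimensions = read_cube_dimensions(catalog_like)
```

Because this bypasses the extension classes entirely, the caller gets untyped dictionaries back and is responsible for any validation.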

soxofaan commented 3 weeks ago

In this example, it's not clear what type catalog.ext.cube should even be.

I agree, but that's not what I'm getting at. My point is if catalogs don't support the (data)cube extension per STAC spec, it's weird to still get True from catalog.ext.has("cube")

gadomski commented 3 weeks ago

My point is if catalogs don't support the (data)cube extension per STAC spec, it's weird to still get True from catalog.ext.has("cube")

It's a good point. has_extension just looks at the stac_extensions field, without considering whether those extension URLs are valid when applied to that object.

I don't have a good solution to the problem in a code sense, so maybe better documentation on has is the answer?
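The difference between "declared" and "applicable" can be sketched on plain STAC dictionaries. This is an illustration only, not pystac's implementation: both function names are hypothetical, and the set of supported object types is an assumption based on the Collection and Item extension classes mentioned earlier in the thread (a STAC Item has type "Feature").

```python
DATACUBE_PREFIX = "https://stac-extensions.github.io/datacube/"

# Assumption for this sketch: the datacube extension applies to
# Collections and Items (STAC "type" values "Collection" and "Feature").
DATACUBE_HOST_TYPES = {"Collection", "Feature"}


def declares_datacube(stac_dict: dict) -> bool:
    """Declaration-only check: inspects stac_extensions and nothing else."""
    return any(
        uri.startswith(DATACUBE_PREFIX)
        for uri in stac_dict.get("stac_extensions", [])
    )


def datacube_applies(stac_dict: dict) -> bool:
    """Hypothetical stricter check: declared AND on a supported object type."""
    return (
        declares_datacube(stac_dict)
        and stac_dict.get("type") in DATACUBE_HOST_TYPES
    )


schema = "https://stac-extensions.github.io/datacube/v2.2.0/schema.json"
catalog = {"type": "Catalog", "id": "foo", "stac_extensions": [schema]}
collection = {"type": "Collection", "id": "bar", "stac_extensions": [schema]}

print(declares_datacube(catalog), datacube_applies(catalog))
print(declares_datacube(collection), datacube_applies(collection))
```

Whether a stricter check like this should return False or raise an error for an unsupported object type is a separate design question.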

jsignell commented 1 week ago

I guess we could make has do more, but as @gadomski said, it is just a different invocation of has_extension. If we were to extend it to check validity, then the response would not be True or False in the case you describe - it would be an error.

If you are just trying to get the cube extension class you don't really need has at all:

if hasattr(stac_obj.ext, "cube"):
    x = stac_obj.ext.cube ...

or

x = getattr(stac_obj.ext, "cube", None)