Open xaviernogueira opened 2 weeks ago
Inspired by the broader, but GDAL dependent, work above
GDAL is not a necessary dependency of the above stac-geoparquet implementation. GeoPandas is included as a dependency because it's the most common library that people historically have used for this, but we don't use any GDAL-related functionality, and it could easily be removed.
allowing `stac-pydantic` models to be passed into functions
The main question is: with your pydantic models, do you know a static schema for the entire collection? I doubt you do, because I assume every Item is validated independently for pydantic. Assuming this, there is not really any benefit to integrating with pydantic, because we have to do our own columnar schema inference anyways.
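To illustrate the point about schema inference, here is a minimal sketch using only plain dicts and the standard library. The helper `infer_columns` and the sample items are hypothetical and are not part of stac-geoparquet; the sketch only shows that two items can each be valid on their own while the set of columns for the collection is only known after scanning all of them.

```python
from typing import Any


def infer_columns(items: list[dict[str, Any]]) -> dict[str, set[str]]:
    """Collect the union of property names and the Python type names seen for each.

    Hypothetical helper: a stand-in for real columnar schema inference.
    """
    columns: dict[str, set[str]] = {}
    for item in items:
        for key, value in item.get("properties", {}).items():
            columns.setdefault(key, set()).add(type(value).__name__)
    return columns


# Two made-up items that would each pass validation independently,
# but that do not share the same set of properties.
items = [
    {"id": "a", "properties": {"datetime": "2024-01-01T00:00:00Z", "eo:cloud_cover": 12.5}},
    {"id": "b", "properties": {"datetime": "2024-01-02T00:00:00Z", "platform": "sentinel-2a"}},
]

schema = infer_columns(items)
```

Neither item alone tells you that the table needs both an `eo:cloud_cover` and a `platform` column, which is why a per-item model does not remove the need for a pass over the whole collection.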
@kylebarron good points.
On my side I'll probably just quickly support the lightweight conversion bc it's in the scope of this project, and I'll leave it at that. I stand by my `pydantic` + `pyarrow` only vision, even if it's just me being weird about things lol.
My only counterpoint on the `stac-pydantic` topic is that since you already support a `Union[pystac.Item, dict[str, Any]]`, it would just be another `elif` clause here for the arrow conversion (as an example).
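A rough sketch of what I mean by "another `elif`", written with duck typing so it runs without either library installed. The function name `as_item_dict` and the `FakeModel` stand-in are hypothetical, and I'm assuming the model exposes a pydantic v2 style `model_dump()` and that the pystac-style object exposes `to_dict()`.

```python
from typing import Any


def as_item_dict(item: Any) -> dict[str, Any]:
    """Normalize a supported input into a plain dict (hypothetical helper)."""
    if isinstance(item, dict):
        return item
    elif hasattr(item, "model_dump"):
        # Assumption: a pydantic v2 style model.
        return item.model_dump()
    elif hasattr(item, "to_dict"):
        # Assumption: a pystac-style object.
        return item.to_dict()
    raise TypeError(f"Unsupported item type: {type(item).__name__}")


class FakeModel:
    """Stand-in for a validated model, used only for this example."""

    def __init__(self, **data: Any) -> None:
        self._data = data

    def model_dump(self) -> dict[str, Any]:
        return dict(self._data)


from_dict = as_item_dict({"id": "a"})
from_model = as_item_dict(FakeModel(id="b"))
```

Everything downstream of this function would keep working on dicts, so the arrow conversion itself would not need to change.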
That said, going a step further, one could say you are already treating the dicts like objects anyways, with expected keys (where a missing one would throw `KeyError`) and a variety of derived properties like `self_href` that are intrinsically linked with the schema and could be attached to the model class.
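As a sketch of the "derived property on the model class" idea, here is a standard-library dataclass standing in for a validated model. `ItemModel` is hypothetical and is not the actual stac-pydantic class; the `self_href` logic is my assumption of how such a property could be derived from the item's `links`.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ItemModel:
    """Hypothetical stand-in for a validated STAC item model."""

    id: str
    links: list[dict[str, Any]] = field(default_factory=list)

    @property
    def self_href(self) -> Optional[str]:
        """Return the href of the first link with rel == "self", if any."""
        for link in self.links:
            if link.get("rel") == "self":
                return link.get("href")
        return None


item = ItemModel(
    id="a",
    links=[{"rel": "self", "href": "https://example.com/items/a.json"}],
)
```

With the raw dict, the same lookup is spread across call sites and a missing `links` key surfaces as a `KeyError`; here the required fields are declared once and the derived value lives next to them.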
I see many opportunities to consolidate a lot of logic scattered around here in STAC `pydantic` models...but that's just food for thought from my bias. I may throw up a PR more as a conversation starter.
See write up here: https://cloudnativegeo.org/blog/2024/08/introduction-to-stac-geoparquet
Spec here: https://github.com/stac-utils/stac-geoparquet
Inspired by the broader, but GDAL dependent, work above I would like to add the [`stac-pydantic`](https://github.com/stac-utils/stac-pydantic) models as a valid input <> STAC specific subclass of `GeoParquetMetadata` (if necessary).

In addition (as there is value to a non-GDAL dependent workflow), I was wondering what the maintainers of `stac-geoparquet` think about allowing `stac-pydantic` models to be passed into functions (as opposed to just `dict` objects currently). Thoughts @kylebarron, @cholmes, @TomAugspurger?

In my view, the projects can exist together, as the whole point of this one is a super lightweight `pydantic` focused tool. Broadly speaking my vision is being able to "live in a validated world" where one has no need for leaving validated `pydantic` models.