Open-EO / openeo-geopyspark-driver

OpenEO driver for GeoPySpark (Geotrellis)
Apache License 2.0

avoid using local paths as "href" #926

Open soxofaan opened 1 week ago

soxofaan commented 1 week ago

In various places (mainly around batch job result/asset handling) we put local (absolute) paths in an "href" field. This is confusing because nothing indicates that these are local paths (a "file://" scheme would at least make that explicit). It is also not ideal that at some point that "href" probably has to be overwritten with a relative/absolute (HTTP/S3) URL to be usable by the end user, at which point the original location is lost. Overall, it makes it hard to reason about how assets are handled in our code.
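To illustrate the "file://" idea, here is a minimal sketch using only the standard library. The helper name `local_path_to_href` and the example path are hypothetical and not taken from the driver code; it assumes a POSIX host.

```python
from pathlib import Path


def local_path_to_href(path: str) -> str:
    """Turn an absolute local path into an explicit file:// URI.

    Hypothetical helper for illustration only.
    Path.as_uri() raises ValueError when the path is not absolute,
    so relative paths cannot slip through unnoticed.
    """
    return Path(path).as_uri()


# Hypothetical asset entry: the scheme makes it obvious this is a local file.
asset = {"href": local_path_to_href("/data/projects/j-123/openEO.tif")}
print(asset["href"])
```

With this, code that later rewrites the "href" to an HTTP or S3 URL can at least tell from the scheme that it started from a local file.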

bossie commented 1 week ago

It's clear from the STAC spec that "href" should be a URI with a scheme, so I would even call this a bug.

Re technical debt: I would also avoid passing URIs around as strings and instead use a dedicated type, as that will probably get rid of a great deal of implicit assumptions and workarounds (think "//" vs "/").

I haven't found a dedicated URI class in Python like java.net.URI; maybe we could look into https://pypi.org/project/uri/. Automatic translation between the two by py4j should be trivial.
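As a rough sketch of what such a dedicated type could look like, the snippet below wraps the standard library's `urllib.parse` instead of the `uri` package. The class name `Uri` and its methods are assumptions made for illustration; query strings and fragments are deliberately ignored to keep it short.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit, urlunsplit


@dataclass(frozen=True)
class Uri:
    """Hypothetical minimal URI value type (scheme, authority, path only)."""

    scheme: str
    netloc: str
    path: str

    @classmethod
    def parse(cls, value: str) -> "Uri":
        parts = urlsplit(value)
        # Reject bare local paths: an href must carry an explicit scheme.
        if not parts.scheme:
            raise ValueError(f"URI without scheme: {value!r}")
        # Query and fragment are dropped in this sketch.
        return cls(parts.scheme, parts.netloc, parts.path)

    def joinpath(self, *segments: str) -> "Uri":
        # Normalize slashes in one place so callers never have to
        # worry about "//" vs "/" when concatenating.
        cleaned = [s.strip("/") for s in segments]
        path = "/".join([self.path.rstrip("/")] + cleaned)
        return Uri(self.scheme, self.netloc, path)

    def __str__(self) -> str:
        return urlunsplit((self.scheme, self.netloc, self.path, "", ""))


# Hypothetical bucket and job names.
base = Uri.parse("s3://bucket/jobs/")
print(base.joinpath("/j-123/", "openEO.tif"))
```

The point of the design is that slash handling and scheme checks live in one type, rather than being repeated wherever strings are concatenated.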

soxofaan commented 2 days ago

Another symptom: https://github.com/Open-EO/openeo-geopyspark-driver/pull/929#discussion_r1838256145