radiantearth / stac-spec

SpatioTemporal Asset Catalog specification - making geospatial assets openly searchable and crawlable
https://stacspec.org
Apache License 2.0

Evolve best practices on URLs #1325

Open m-mohr opened 4 days ago

m-mohr commented 4 days ago

The best practices on https://github.com/radiantearth/stac-spec/blob/v1.0.0/best-practices.md#use-of-links seem partially ambiguous, and common practice may have evolved a bit since they were written.

We have the following in the best practices:

  • Self-contained catalogs
  • Relative published catalogs
  • Absolute published catalogs

The approach that I recommend is:

  • Absolute self link in all files
  • Relative STAC links
  • Relative assets unless hosted externally

STAC Browser also recommends what I recommend (unsurprisingly ;-) ) or the Absolute published catalog. Those two generally work best for STAC Browser, as it can generate absolute links for an entity without requesting the root via an additional request. It seems that the first of these is not reflected in the best practices, though. I feel like the best practices should evolve a bit and may even be simplified.
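A minimal sketch of why an absolute self link is enough for a client: every relative STAC link can be resolved against it without fetching the root first. The item and all `example.com` URLs are made up for illustration, and `self_href` and `resolve_links` are hypothetical helper names, not part of STAC Browser or any STAC library.

```python
from urllib.parse import urljoin

# Hypothetical item: absolute self link, relative STAC links,
# relative asset href. All URLs are invented for this sketch.
item = {
    "type": "Feature",
    "id": "scene-1",
    "links": [
        {"rel": "self", "href": "https://example.com/catalog/collection/scene-1/scene-1.json"},
        {"rel": "parent", "href": "../collection.json"},
        {"rel": "root", "href": "../../catalog.json"},
    ],
    "assets": {"data": {"href": "./scene-1.tif"}},
}


def self_href(entity):
    """Return the href of the first link with rel == "self", or None."""
    for link in entity.get("links", []):
        if link.get("rel") == "self":
            return link["href"]
    return None


def resolve_links(entity):
    """Map each link rel to an absolute URL, resolved against the self link.

    Keyed by rel for brevity, so repeated rels (e.g. several "item" links)
    would overwrite each other; a real client would keep a list instead.
    """
    base = self_href(entity)
    if base is None:
        raise ValueError("no self link to resolve against")
    return {link["rel"]: urljoin(base, link["href"]) for link in entity["links"]}


resolved = resolve_links(item)
print(resolved["parent"])
print(resolved["root"])
```

Without the absolute self link, the client would have to remember the URL it fetched the file from, or walk up to the root, to do the same resolution.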

The cases above come up very naturally; the difference is usually the self links. Everything else should just be used consistently for STAC links and assets. Depending on the hosting context, the assets are also often just absolute.

Thoughts? Otherwise, I'll try to create a PR soon...

gadomski commented 3 days ago

The approach that I recommend is:

  • Absolute self link in all files
  • Relative STAC links
  • Relative assets unless hosted externally

I agree with this except I prefer absolute asset hrefs. I find that STAC is much more mobile than assets. You copy STAC down into some local data store, you load them into a backend, whatever, but the assets stay put. I even think that co-hosting STAC and assets might (hot take alert 🌶️ ) be an anti-pattern.
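To illustrate the "STAC moves, assets stay put" point, here is a rough sketch that rewrites asset hrefs to absolute URLs before an item is copied elsewhere. `absolutize_assets` is a hypothetical helper and the URLs are invented; it assumes the item's published location (its self link) is known.

```python
import copy
from urllib.parse import urljoin


def absolutize_assets(item, base_href):
    """Return a copy of item whose asset hrefs are absolute.

    base_href is the URL the item was published at (its self link).
    Hrefs that are already absolute pass through urljoin unchanged.
    """
    result = copy.deepcopy(item)
    for asset in result.get("assets", {}).values():
        asset["href"] = urljoin(base_href, asset["href"])
    return result


# Hypothetical published location and item.
published_at = "https://example.com/catalog/scene-1/scene-1.json"
item = {
    "id": "scene-1",
    "assets": {
        "data": {"href": "./scene-1.tif"},
        "thumbnail": {"href": "https://cdn.example.com/thumbs/scene-1.png"},
    },
}

portable = absolutize_assets(item, published_at)
print(portable["assets"]["data"]["href"])
```

Once the hrefs are absolute, the item can be copied to a local store or loaded into a backend and the assets still resolve to their original location.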

I don't think self-contained catalogs are very useful at all (I think @matthewhanson agrees with me on this, but I could be misremembering) and would be 👍🏼 on removing them from the best practices. I don't love "Absolute Published Catalogs" either; I don't really see what advantage they have over Matthias's Preferred Approach™. The tooling is generally good enough to resolve relative links, so I think the complexity win of removing "Absolute Published Catalogs" is worth it.

So, to summarize:

I think there's an advantage in having "one preferred way" in the Best Practices, with some discussion on why you might tweak, e.g. why you would choose absolute or relative asset hrefs.

jbants commented 2 days ago

I agree with best practices that assets should stay put and require an absolute link. STAC is a signpost to the data, after all. Removing Self-contained and Absolute published catalogs and changing Relative published catalogs to MPC (I wonder if that acronym is taken) seems like a good idea. The spec doesn't prohibit these types of catalogs; they're just not optimal.

m-mohr commented 2 days ago

Haha, I guess we can find a better name :-P

Thanks for your inputs.

I don't feel strongly about relative vs. absolute assets. As you two advocate for it, what's the benefit of having absolute links? I haven't found a good reason yet why assets should necessarily be absolute if there's an absolute self link. Could you elaborate on that? @gadomski @jbants

gadomski commented 1 day ago

As you two advocate for it, what's the benefit of having absolute links?

I think the benefit is as a signalling mechanism, i.e. how we think STAC should be used. By recommending absolute asset hrefs but relative links, we tell folks "you can and should move your STAC around as much as you want, but you probably want to keep your assets in one place." Relative asset hrefs make it feel like the STAC should live next to the assets, which (at least to me) feels a little wrong.

m-mohr commented 10 hours ago

Okay, so it's the moving aspect. On the other hand, for me relative links would keep the maintenance burden down a bit, as you only have to update one link (the self link). Maybe we don't recommend anything and leave it up to users? I only see minor pros/cons so far for absolute and relative links respectively, so maybe it's not worth a best practice yet?
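The maintenance argument can be made concrete with a small sketch that counts the hrefs needing an edit when an item and its assets move together to a new base URL. This assumes the assets really do move with the STAC, which is exactly the case the absolute-asset camp argues against. `hrefs_to_update` and both items are hypothetical.

```python
from urllib.parse import urlparse


def hrefs_to_update(item):
    """List the hrefs that need editing if the item and its assets move
    together: every absolute href. Relative hrefs keep working as-is."""
    hrefs = [link["href"] for link in item.get("links", [])]
    hrefs += [asset["href"] for asset in item.get("assets", {}).values()]
    # An href with a URL scheme (https, s3, ...) is treated as absolute.
    return [h for h in hrefs if urlparse(h).scheme]


links = [
    {"rel": "self", "href": "https://example.com/catalog/scene-1/scene-1.json"},
    {"rel": "parent", "href": "../collection.json"},
]

relative_assets = {
    "links": links,
    "assets": {"data": {"href": "./scene-1.tif"}, "thumb": {"href": "./thumb.png"}},
}
absolute_assets = {
    "links": links,
    "assets": {
        "data": {"href": "https://example.com/catalog/scene-1/scene-1.tif"},
        "thumb": {"href": "https://example.com/catalog/scene-1/thumb.png"},
    },
}

print(len(hrefs_to_update(relative_assets)))
print(len(hrefs_to_update(absolute_assets)))
```

With relative assets only the self link changes; with absolute assets each asset href changes as well. If the assets stay where they are and only the STAC moves, the situation reverses: the absolute hrefs keep working and the relative ones break.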