radiantearth / stac-spec

SpatioTemporal Asset Catalog specification - making geospatial assets openly searchable and crawlable
https://stacspec.org
Apache License 2.0

absolute/relative references #373

Closed fredliporace closed 5 years ago

fredliporace commented 5 years ago

See #160 and #56 for related discussion.

Definitions with examples:

Pros/Cons:

I'd say that if I had to choose one model I'd go with absolute path, with a self reference optionally including an absolute url (permalink) if this information is available when creating the STAC files.

Update 02/04/2019

My feeling from the meeting is that any constraint on the reference types will have an impact on some relevant use case. In that scenario, maybe we should make it flexible, even if that increases complexity on the client.

For instance, allow any kind of reference and enforce only an absolute URL (preferable) or an absolute path for self, making self optional to deal with the case where there is no way to obtain the absolute reference/URL.

Update 02/16/2019

My current proposal: recommend relative links for everything except self. This supports the 'copy set/subset' use case described above. Since we want to support data that is not going online we need to change the self definition: Possible alternatives:

m-mohr commented 5 years ago

My current proposal: recommend relative links for everything except self. This supports the 'copy set/subset' use case described above. Since we want to support data that is not going online we need to change the self definition: Possible alternatives:

Yes, I agree and I think relative is the most universal and useful solution.

  • Making it optional and keep the absolute url requirement
  • Keeping it required but allowing absolute references for the 'not online' use case

I'd say "Make it optional and require either absolute path or absolute url". Here's why:

cholmes commented 5 years ago

I'm leaning towards relative links as well, with a few caveats to explore. In particular I'd like to consider the API use case, and whether we can just require absolute links there. I'd also like to consider whether we explicitly call out an 'online static catalog' that does have absolute URLs, though perhaps we just put them at the 'collection' level instead of in an item.

I do definitely lean towards 'make it optional' for self link - I think a self link with a relative path is not useful, so let's just leave it off.

m-mohr commented 5 years ago

I can't quite follow what you are describing for the API use case. Why does the API use case require an absolute URL?

fredliporace commented 5 years ago

I'd say "Make it optional and require either absolute path or absolute url". Here's why:

  • First use-case: Static catalog (NOT online) -> absolute path (but not sure whether it is helpful for anything)
  • Second use case: Static catalog (online) -> absolute url sounds very helpful
  • Third use case (biased by my openEO view): Collection only API such as GEE or openEO (discovery), or full catalogs with items that can be downloaded from a protected (non-permanent) user workspace (openEO results): self link not required or not available.

I'm OK with that.
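A minimal sketch of how a validator might check that rule, assuming "optional, but absolute path or absolute URL when present" is what ends up being adopted. The function names `classify_href` and `self_link_ok` are hypothetical and not part of any STAC tooling.

```python
from urllib.parse import urlsplit


def classify_href(href: str) -> str:
    """Classify an href as 'absolute_url', 'absolute_path', or 'relative'."""
    parts = urlsplit(href)
    # Assumption: anything with a scheme plus a host (or a rooted path,
    # as in file:///...) counts as an absolute URL.
    if parts.scheme and (parts.netloc or parts.path.startswith("/")):
        return "absolute_url"
    if href.startswith("/"):
        return "absolute_path"
    return "relative"


def self_link_ok(links: list) -> bool:
    """self is optional; when present it must not be a relative reference."""
    self_links = [link for link in links if link.get("rel") == "self"]
    return all(classify_href(link["href"]) != "relative" for link in self_links)
```

With this check, a document without any self link passes, and so does one whose self link is an absolute URL or absolute path; only a relative self link is rejected.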

fredliporace commented 5 years ago

@cholmes I'm currently exposing the static catalog through an API and it would be better to use the same document for both static and API. In that case forcing absolute URLs for the API would result in also using absolute URLs for static.

m-mohr commented 5 years ago

@fredliporace Couldn't you simply add the link in the API? Or isn't that going through some server-side processing?

fredliporace commented 5 years ago

Well, while working on this answer I guess I understood @cholmes's concern. My current development implementation for the API has the following address, which will be changed for production:

https://4jp7f1hqlj.execute-api.us-east-1.amazonaws.com/prod/stac/search/

A sample of current returned data is:

{
  "type": "FeatureCollection",
  "features": [
    {
      "id": "CBERS_4_MUX_20190203_162_119_L4",
      "type": "Feature",
      "geometry": {
        "type": "MultiPolygon",
        "coordinates": [
          [
            [
              [
                -52.981499,
                -17.447696
              ],
              [
                -51.850932,
                -17.618832
              ],
              [
                -51.60912,
                -16.556977
              ],
              [
                -52.733004,
                -16.386859
              ],
              [
                -52.981499,
                -17.447696
              ]
            ]
          ]
        ]
      },
      "bbox": [
        -52.986338,
        -17.61936,
        -51.606218,
        -16.371447
      ],
      "properties": {
        "datetime": "2019-02-03T13:25:31Z",
        "eo:sun_azimuth": 94.5659,
        "eo:sun_elevation": 57.8094,
        "eo:off_nadir": -0.00913168,
        "eo:epsg": 32751,
        "cbers:data_type": "L4",
        "cbers:path": 162,
        "cbers:row": 119
      },
      "links": [
        {
          "rel": "self",
          "href": "https://cbers-stac-0-6.s3.amazonaws.com/CBERS4/MUX/162/119/CBERS_4_MUX_20190203_162_119_L4.json"
        },
        {
          "rel": "parent",
          "href": "https://cbers-stac-0-6.s3.amazonaws.com/CBERS4/MUX/162/catalog.json"
        },
        {
          "rel": "collection",
          "href": "https://cbers-stac-0-6.s3.amazonaws.com/collections/CBERS_4_MUX_collection.json"
        }
      ],
      "assets": {
        "thumbnail": {
          "href": "https://s3.amazonaws.com/cbers-meta-pds/CBERS4/MUX/162/119/CBERS_4_MUX_20190203_162_119_L4/CBERS_4_MUX_20190203_162_119.jpg",
          "type": "image/jpeg"
        },
        "metadata": {
          "href": "s3://cbers-pds/CBERS4/MUX/162/119/CBERS_4_MUX_20190203_162_119_L4/CBERS_4_MUX_20190203_162_119_L4_BAND6.xml",
          "title": "INPE original metadata",
          "type": "text/xml"
        },
        "B5": {
          "href": "s3://cbers-pds/CBERS4/MUX/162/119/CBERS_4_MUX_20190203_162_119_L4/CBERS_4_MUX_20190203_162_119_L4_BAND5.tif",
          "type": "image/vnd.stac.geotiff; cloud-optimized=true",
          "eo:bands": [
            0
          ]
        },
        "B6": {
          "href": "s3://cbers-pds/CBERS4/MUX/162/119/CBERS_4_MUX_20190203_162_119_L4/CBERS_4_MUX_20190203_162_119_L4_BAND6.tif",
          "type": "image/vnd.stac.geotiff; cloud-optimized=true",
          "eo:bands": [
            1
          ]
        },
        "B7": {
          "href": "s3://cbers-pds/CBERS4/MUX/162/119/CBERS_4_MUX_20190203_162_119_L4/CBERS_4_MUX_20190203_162_119_L4_BAND7.tif",
          "type": "image/vnd.stac.geotiff; cloud-optimized=true",
          "eo:bands": [
            2
          ]
        },
        "B8": {
          "href": "s3://cbers-pds/CBERS4/MUX/162/119/CBERS_4_MUX_20190203_162_119_L4/CBERS_4_MUX_20190203_162_119_L4_BAND8.tif",
          "type": "image/vnd.stac.geotiff; cloud-optimized=true",
          "eo:bands": [
            3
          ]
        }
      }
    }
  ]
}

I'm currently using absolute links. If I were using relative links it would not be possible to follow the resulting links directly. The browser would have to build the link based on 'self' and then apply the relative information.

I'm documenting this use case in #401.

m-mohr commented 5 years ago

Why would we not be able to follow them? A client just needs to resolve the relative links against the URL it requested, I guess. Of course, if links point to another server or so, you'd need absolute. So we must allow absolute URL + relative URL for all links, except self, which is optional and must be an absolute URL (and maybe absolute path, but still not sure how useful that is). In the end it seems we are just allowing what the WWW/HTML allows, which has worked for ages. ;)
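A rough sketch of that client-side resolution, using Python's standard `urllib.parse.urljoin`. The helper name `resolve_links` and the `example.com` URL are made up for illustration.

```python
from urllib.parse import urljoin


def resolve_links(links: list, requested_url: str) -> list:
    """Return copies of the links with each href resolved against the URL
    the document was fetched from. Hrefs that are already absolute
    (https://..., s3://...) come back unchanged."""
    return [{**link, "href": urljoin(requested_url, link["href"])} for link in links]


# Hypothetical item location, not a real catalog.
requested = "https://example.com/stac/CBERS4/MUX/162/119/item.json"
links = [
    {"rel": "parent", "href": "../catalog.json"},
    {"rel": "collection", "href": "https://other.example.com/collection.json"},
]
resolved = resolve_links(links, requested)
```

Here the parent link resolves to `https://example.com/stac/CBERS4/MUX/162/catalog.json`, while the collection link on the other server is left as it is.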

fredliporace commented 5 years ago

In that case the requested URL would be

https://4jp7f1hqlj.execute-api.us-east-1.amazonaws.com/prod/stac/search/

and the parent relative link would be something like

../catalog.json

so that would not be a simple concatenation of requested URL and the relative link. This kind of resolution works well for static pages, but not quite if you use something like the STAC search API.
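To make the mismatch concrete, here is a small sketch using `urllib.parse.urljoin` with the URLs from this thread. It resolves the same relative parent link once against the search endpoint and once against the item's static self link from the sample above.

```python
from urllib.parse import urljoin

# URLs taken from the sample response earlier in this thread.
SEARCH_URL = "https://4jp7f1hqlj.execute-api.us-east-1.amazonaws.com/prod/stac/search/"
SELF_URL = (
    "https://cbers-stac-0-6.s3.amazonaws.com/CBERS4/MUX/162/119/"
    "CBERS_4_MUX_20190203_162_119_L4.json"
)
PARENT_REL = "../catalog.json"

# Resolved against the URL the client actually requested.
from_search = urljoin(SEARCH_URL, PARENT_REL)
# Resolved against the item's location in the static catalog.
from_self = urljoin(SELF_URL, PARENT_REL)
```

`from_search` ends up under the API host (`.../prod/stac/catalog.json`), whereas `from_self` gives the static parent catalog (`.../CBERS4/MUX/162/catalog.json`), which is the one the item means.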

m-mohr commented 5 years ago

Well, I expected that an API would generate meaningful relative URLs and not just pass through the URLs from the static catalog. Passing the URLs through will never work with relative links in the API, of course.

cholmes commented 5 years ago

@fredliporace - I don't think using the same document for static and API is the way to go. What I'm leaning towards is that 'self-contained catalogs' are static ones that follow a recommendation to have all relative links. And then I'm thinking of even going so far as to try to require that STAC APIs return absolute self links. Like @m-mohr I expect an API to generate its URLs, and to almost always have the API URLs be absolute.

Indeed I think there could be a recommendation that an API that is powered by a static catalog would use a rel link to point back to the place the static catalog lives. Perhaps even use rel=canonical, to say that 'this is the core location that this item lives at'.
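A sketch of what that could look like on the API side, assuming the API knows both where the item lives in the static catalog and the URL it serves the item under. The function name `to_api_item` and the `api.example.com` URL are hypothetical, and resolving the remaining links against the static location is just one possible choice.

```python
from urllib.parse import urljoin


def to_api_item(static_item: dict, static_url: str, api_item_url: str) -> dict:
    """Build the API representation of an item read from a static catalog."""
    links = []
    for link in static_item.get("links", []):
        # Drop any existing self/canonical links; the API generates its own.
        if link.get("rel") in ("self", "canonical"):
            continue
        # Relative links are made absolute against the static location.
        links.append({**link, "href": urljoin(static_url, link["href"])})
    # Absolute self link for the API, canonical link back to the static file.
    links.insert(0, {"rel": "self", "href": api_item_url})
    links.append({"rel": "canonical", "href": static_url})
    return {**static_item, "links": links}


static_item = {
    "id": "CBERS_4_MUX_20190203_162_119_L4",
    "links": [{"rel": "parent", "href": "../catalog.json"}],
}
api_item = to_api_item(
    static_item,
    "https://cbers-stac-0-6.s3.amazonaws.com/CBERS4/MUX/162/119/CBERS_4_MUX_20190203_162_119_L4.json",
    "https://api.example.com/stac/items/CBERS_4_MUX_20190203_162_119_L4",  # hypothetical
)
```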

mojodna commented 5 years ago

A few weeks back, I'd vaguely proposed a sidecar that would facilitate using relative URLs for all links (including self), including when a catalog has been copied from its original location. Originally, I was thinking that it would need to exist for each sub-catalog and that both parent catalogs and associated sidecars would need to be read/navigated in order to resolve a URL for a child or item.

However, since we're encouraging publishers to include both parent and root rels, a minor addition to the root link will allow us to resolve self with only a single sidecar file (and read; no need to navigate or read parent catalogs). If we include the reverse link (i.e. if root's href is ../catalog.json, the reverse might be 12/catalog.json), we know the path from the root (which would have a sidecar file consisting of the absolute URL to it) to the sub-catalog/item and can produce an absolute URL easily.

HTML's rev attribute appears to describe this pattern (but has been dropped in HTML5). Potential ways to describe this could be:

{
  "rel": "root",
  "href": "../catalog.json"
},
{
  "rev": "root",
  "href": "12/catalog.json"
}

Alternately, something like (from doesn't feel quite right, but the idea is that it would be an additional attribute within the link):

{
  "rel": "root",
  "href": "../catalog.json",
  "from": "12/catalog.json"
}
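
A quick sketch of how a client might use this second form, assuming the sidecar next to the root catalog contains nothing but the root's absolute URL. The `from` attribute is only the proposal above, and `resolve_self` and the `example.com` URL are made up for illustration.

```python
from urllib.parse import urljoin


def resolve_self(root_link: dict, root_absolute_url: str) -> str:
    """Derive a document's absolute URL from the root's absolute URL
    (read from the sidecar) and the reverse path stored on the root link."""
    return urljoin(root_absolute_url, root_link["from"])


root_link = {"rel": "root", "href": "../catalog.json", "from": "12/catalog.json"}
# Hypothetical sidecar content: the absolute URL of the root catalog.
sidecar_url = "https://example.com/stac/catalog.json"
self_url = resolve_self(root_link, sidecar_url)
```

With these inputs `self_url` is `https://example.com/stac/12/catalog.json`, obtained from a single sidecar read and without walking the parent catalogs.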

cholmes commented 5 years ago

#414 closes this, with some additional color on it coming in #428.