stac-extensions / ml-model

An Item and Collection extension to describe machine learning (ML) models that operate on Earth observation data.
Apache License 2.0

Add scheme to Docker URIs #9

Open duckontheweb opened 2 years ago

duckontheweb commented 2 years ago

The current definition of the ml-model:inferencing-image and ml-model:training-image links requires the href to have the form <registry_domain>/<user_or_organization_name>/<image_name>:<tag> (e.g. docker.io/some_organization/example-image:1). Because that value has no scheme, it looks like a relative URI. This causes problems in STAC API implementations like stac-fastapi that replace relative URLs with absolute URLs prefixed with the API's root path. For example, in an API with a root of https://some-api.com/stac, the href above would become https://some-api.com/stac/docker.io/some_organization/example-image:1 in any Item containing that link.

We should consider redefining that Link type to avoid this issue. We could require that the URI be prefixed with a scheme like docker://, but we may also want to reconsider how that link is defined. Most STAC links seem to be defined as HTTP links, so it might be more appropriate to link to the homepage for an image (e.g. https://hub.docker.com/layers/radiantearth/cyclone-model-torchgeo/1/images/sha256-bf09b7cc7e088c0089c0833a4cca0b1f5b4b21a18360ad98e21325ae87c0065a?context=repo instead of docker.io/radiantearth/cyclone-model-torchgeo:1).
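A minimal sketch of the behaviour described above, using Python's standard urllib.parse rather than stac-fastapi's actual code (which may differ). The API_ROOT constant and the has_scheme and resolve helpers are hypothetical names chosen for illustration, and the trailing slash on the root is an assumption made so that the joined result matches the example in this issue.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical API root; the trailing slash makes urljoin append to /stac/
# instead of replacing the last path segment.
API_ROOT = "https://some-api.com/stac/"


def has_scheme(href: str) -> bool:
    """Return True if the href starts with a URI scheme such as https: or docker:."""
    return bool(urlsplit(href).scheme)


def resolve(href: str, root: str = API_ROOT) -> str:
    """Resolve an href against the API root, as a generic URL joiner would."""
    return urljoin(root, href)


bare = "docker.io/some_organization/example-image:1"
prefixed = "docker://" + bare

# No scheme: the href is treated as a relative reference and gets prefixed.
print(has_scheme(bare), resolve(bare))

# With a scheme: the href is already absolute and is returned unchanged.
print(has_scheme(prefixed), resolve(prefixed))
```

Under these assumptions, any scheme (docker:// or https://) is enough to keep the href from being rewritten; which scheme is most appropriate is the open question here.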

cc: @guidorice @m-mohr

m-mohr commented 2 years ago

Yes, the current definition doesn't make sense. I didn't realize that when reviewing. A link href should be a valid resolvable URL and nothing else.

Looking at the example, the following should probably be changed:

{
  "rel": "ml-model:inferencing-image",
  "href": "registry.hub.docker.com/my-user/my-inferencing-model:v1",
  "type": "docker-image",
  "title": "My Model (v1)"
}

to, for example:

{
  "rel": "ml-model:inferencing-image",
  "href": "https://registry.hub.docker.com/my-user/my-inferencing-model",
  "docker:name": "registry.hub.docker.com/my-user/my-inferencing-model:v1",
  "type": "docker-image",
  "title": "My Model (v1)"
}

or:

{
  "rel": "ml-model:inferencing-image",
  "href": "https://registry.hub.docker.com/my-user/my-inferencing-model",
  "docker:registry": "registry.hub.docker.com",
  "docker:image": "my-user/my-inferencing-model",
  "docker:tag": "v1",
  "type": "docker-image",
  "title": "My Model (v1)"
}

Or is there anything that can be used as an absolute URL instead? Is docker:// commonly used? Is there a way to link to specific versions? Right now the href is not 100% ideal.

Eventually, we could also extract all this into a docker link extension, I guess.
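As a rough illustration of the split-field alternative above, the sketch below derives the proposed docker:registry, docker:image, and docker:tag values from an image reference. Those field names are only the ones suggested in this comment, not part of any published extension, and split_image_reference and build_link are hypothetical helpers. It assumes the reference always starts with a registry domain, as the current link definition requires.

```python
from typing import Optional


def split_image_reference(reference: str) -> dict:
    """Split '<registry>/<image>[:<tag>]' into its parts.

    Assumes the first path segment is always the registry domain.
    """
    registry, _, remainder = reference.partition("/")
    # Only treat ':' as a tag separator if it appears in the last segment.
    last_segment = remainder.rsplit("/", 1)[-1]
    tag: Optional[str] = None
    image = remainder
    if ":" in last_segment:
        image, _, tag = remainder.rpartition(":")
    return {"registry": registry, "image": image, "tag": tag}


def build_link(reference: str, rel: str, title: str) -> dict:
    """Build a link object using the field names proposed in this thread."""
    parts = split_image_reference(reference)
    link = {
        "rel": rel,
        # Absolute HTTP href so that API servers do not treat it as relative.
        "href": f"https://{parts['registry']}/{parts['image']}",
        "docker:registry": parts["registry"],
        "docker:image": parts["image"],
        "type": "docker-image",
        "title": title,
    }
    if parts["tag"] is not None:
        link["docker:tag"] = parts["tag"]
    return link


link = build_link(
    "registry.hub.docker.com/my-user/my-inferencing-model:v1",
    rel="ml-model:inferencing-image",
    title="My Model (v1)",
)
print(link)
```

Note that the href built this way is absolute but does not identify a specific version, so the tag only survives in the extra field.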

fmigneault commented 4 months ago

@m-mohr Has anything been done about this, specifically the docker:// scheme and a potential docker extension? (I like the docker:// idea: although it is not official or common, it removes ambiguity with https://.) Similar issues are encountered with MLM, where STAC API implementations mangle URIs that lack a protocol as if they were relative links. For now, the workaround is to use processing:expression (see https://github.com/stac-extensions/processing/pull/33 and https://github.com/stac-extensions/processing/issues/31).

m-mohr commented 4 months ago

Not that I'm aware of.