Open fabianburth opened 1 month ago
Just for my understanding:
I suggest we decide not to support cross-consumption at all (at least for now). In that case, we can also omit the hint from the maven spec!
This would mean, that I can't download from a github repository and upload it for an OCI repository as a tarred up content, right?
Suggestion 1
I would even go as far as renaming it to something that provides a better context by name. Something like uploaderInfo or context or description or what about metadata that is generally used for additional information style data? hint, in my mind, is something that would just be displayed textually in some manner, providing a description of some source. Not an actual instruction to be used by the uploader.
Otherwise, I'm in full support of this 👍 Nice writeup!
This would mean, that I can't download from a github repository and upload it for an OCI repository as a tarred up content, right?
Right!
I would even go as far as renaming it to something that provides a better context by name. Something like uploaderInfo or context or description or what about metadata that is generally used for additional information style data?
Yeah, I agree - if we were to go through with suggestion 1, we should also rename it. There is already a field for general additional information style data, the labels. I also thought about whether it would make sense to implement essentially what is suggestion 1 as a label. But - as also stated above - since the transport process is such an integral part of the ocm, I thought uploaders should have a dedicated field within the spec.
Yah, I definitely agree to not have it as yet another label that could easily be missed.
Note:
Scenario 2 is already working. In addition, we will allow the media type to be overwritten by the MediaType of the AccessSpec if it is set. Otherwise, the MIME type will still be used as it is now:
crane manifest ghcr.io/skarlso/maven-test-3/component-descriptors/ocm.software/demo/test-2:1.0.2
{"schemaVersion":2,"mediaType":"application/vnd.oci.image.manifest.v1+json","config":{"mediaType":"application/vnd.ocm.software.component.config.v1+json","digest":"sha256:5c2e73f0ece5566b7280af541eb0f752d4978165682f6bcd41aa59460fb148e4","size":201},"layers":[{"mediaType":"application/vnd.ocm.software.component-descriptor.v2+yaml+tar","digest":"sha256:06ecd47db00216f6ce73d34665dcab631b83be8b17d4c151c6d88c0d6b623b29","size":2560},{"mediaType":"application/x-tgz","digest":"sha256:7a9cdf674fc1703d6382f5f330b3d110ea1b512b51f1652846d9e4e8a588d766","size":9102945,"annotations":{"ocm-artifact":"[{\"kind\":\"resource\",\"identity\":{\"name\":\"mavengav\"}}]"}}]}
In the above, the mediaType of the resource layer is application/x-tgz, which is the MIME type OCM derived from the extension of the file.
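The fallback described in this note can be sketched as follows. This is only an illustration in Python, not the OCM implementation: the function name resolve_media_type, the extension table, and the default value are assumptions made up for this example.

```python
import os

# Hypothetical extension table; the mapping OCM actually uses is not shown in this thread.
EXTENSION_MEDIA_TYPES = {
    ".tgz": "application/x-tgz",
    ".tar": "application/x-tar",
}

# Assumed default for unknown extensions.
DEFAULT_MEDIA_TYPE = "application/octet-stream"


def resolve_media_type(file_name, access_spec_media_type=None):
    """Prefer the media type set in the access spec; otherwise guess from the extension."""
    if access_spec_media_type:
        return access_spec_media_type
    _, extension = os.path.splitext(file_name)
    return EXTENSION_MEDIA_TYPES.get(extension.lower(), DEFAULT_MEDIA_TYPE)
```

With this sketch, a file named demo.tgz resolves to application/x-tgz unless the access spec carries its own media type, in which case that value wins.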
Add hint to the artifact specification
Uploaders
The ocm library has a concept of uploaders (also called blobhandlers). These uploaders essentially provide the functionality to upload a blob described as an artifact (thus, as a source or resource) as part of a component to a technology specific storage.
The uploaders are an integral part of the ocm transport process. During an ocm transfer, uploaders can be configured to be called to upload artifacts to a technology specific repository.
Mechanism
The mechanism behind the uploaders is explained by answering the following questions.
How does the ocm decide which uploader(s) are called for each particular artifact?
There is a registry of uploaders where the technology specific uploaders can be registered to be called for the set of (or a subset of) the following properties: artifact type, mime type, and implementation repository type.
The implementation repository type describes the type of repository technology based on which the ocm repository is implemented (also referred to as storage backend mapping in the ocm spec). The most common type of repository technology is an OCI registry.
The implementation repository type allows for the implementation of default uploaders for ocm repository types. For example, if the implementation repository type is oci, an oci uploader attempts to upload all artifacts of artifact type ociArtifact as individual oci artifacts (without this uploader, the artifacts would only be available as a blob which is described by a layer of the oci artifact representing the respective ocm component).
How does each uploader know where to upload the artifact to?
To upload the blob, the uploaders get the blob itself, the artifact type, mime type, implementation repository type (so, the information they may be registered for), and a hint. The former information is supposed to be used by the uploader to ensure that the artifact is suitable to be uploaded to the corresponding technology specific repository.
The hint, however, is supposed to contain any further information that might be needed by the uploader to correctly upload the blob.
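To make the mechanism above more concrete, here is a minimal sketch in Python of a registry that matches uploaders on artifact type, mime type, and implementation repository type, and then hands the hint to the selected uploader. It does not reflect the actual ocm library API: the names UploaderRegistry, Registration, and oci_uploader, as well as the hint value, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# An uploader gets the blob, the three properties it may be registered for, and a hint.
Uploader = Callable[[bytes, str, str, str, str], str]


@dataclass
class Registration:
    uploader: Uploader
    artifact_type: Optional[str] = None  # None means "match any value"
    mime_type: Optional[str] = None
    repo_type: Optional[str] = None

    def matches(self, artifact_type, mime_type, repo_type):
        pairs = (
            (self.artifact_type, artifact_type),
            (self.mime_type, mime_type),
            (self.repo_type, repo_type),
        )
        return all(want is None or want == got for want, got in pairs)


class UploaderRegistry:
    def __init__(self):
        self._registrations = []

    def register(self, uploader, artifact_type=None, mime_type=None, repo_type=None):
        # Registering for a subset of the properties leaves the others as wildcards.
        self._registrations.append(
            Registration(uploader, artifact_type, mime_type, repo_type)
        )

    def lookup(self, artifact_type, mime_type, repo_type):
        return [
            r.uploader
            for r in self._registrations
            if r.matches(artifact_type, mime_type, repo_type)
        ]


def oci_uploader(blob, artifact_type, mime_type, repo_type, hint):
    # Stand-in for a real upload: it only reports where the blob would go.
    return f"would upload {len(blob)} bytes to {hint}"


registry = UploaderRegistry()
registry.register(oci_uploader, artifact_type="ociArtifact", repo_type="oci")

for upload in registry.lookup("ociArtifact", "application/x-tgz", "oci"):
    print(upload(b"blob", "ociArtifact", "application/x-tgz", "oci",
                 "ocm.software/ocm-cli:v1.0.0"))
```

The first three properties only decide whether an uploader is called at all; the hint is the only input that tells it where the blob should end up.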
Example
Assume we configured exactly one uploader with the following configuration:
This registers an OCI uploader for artifact type ociArtifact. As described above, the OCI uploader is consequently called during transfer for this artifact (and all other artifacts of type ociArtifact in our hypothetical component). As a result, the image.tar.gz file in the maven package would be uploaded to https://ghcr.io/open-component-model/maven/ocm.software/ocm-cli:v1.0.0.
ISSUE 1:
The current oci uploader would attempt to check the media type of the blob described by the maven access. Since a GAVCE can always match multiple files (e.g. based on the above example, if the maven package contains a file ocm-cli-image.tar.gz and another file ocm-website-image.tar.gz), the maven access method currently returns all blobs as tar.gz with the corresponding media type application/x-tar. If the GAVCE matches multiple files, it is necessary to create a tar archive. But if our intention was to specify a particular file with the GAVCE - as in our above example, where we want to specify a particular oci artifact - it is rather inconvenient that it is tar'ed, since the current implementation of the oci upload handler cannot deal with a tar.gz.tar.gz file. Of course, we could provide a special upload handler that knows how to do this, but this is inconvenient.
If we assume we resolved the above problem and return single files exactly as they are, we would still have to know the media type of the file specified by the GAVCE. Thus, an additional property mediaType is needed within the maven input and access spec.
ISSUE 2:
Currently, the hint is provided by the access method. That is because the current access method might be able to provide a hint. For example, assume you have an artifact with an ociArtifact access type. If you transfer the component including the resources to another registry without having uploaders registered, the access type will be changed to localBlob.
So, it will be converted from:
to:
Thereby, the ociArtifact access method provides the hint, which is then stored in the referenceName of the localBlob access spec so that the artifact can be uploaded to a similar location in another registry if the blob is transferred again with an oci uploader registered. Consequently, the localBlob access method would then provide the referenceName as hint during that upload.
Suggestion 1
Since the concept of uploaders is more generic than this oci use case and since the hint is independent of the access specification of the artifact, I suggest that we extend the ocm specification to make the hint an additional optional property of artifacts (thus, parallel to the type of an artifact).
Hints are specific to the type of repository the artifact should be uploaded to, or rather even to the type of uploader (e.g. the hint for an oci uploader likely looks different than the hint for an npm uploader). Moreover, a hint might even be specific to a certain uploader. Therefore, I suggest that the hint property should have a similar structure to labels, consisting of a name (string), value (any), and version (string).
To preempt the question why I would not add a particular label for this hint instead of adjusting the ocm specification - that is because, as mentioned above, the concept of uploaders (or blobhandlers) is an integral part of the transport process.
Suggestion 2
Since the concept described above is primarily necessary if we allow cross-consumption (so to download a blob from one type of repository, here maven, and upload it to a different type of repository, here oci), I suggest we decide not to support cross-consumption at all (at least for now). In that case, we can also omit the hint from the maven spec!
An optional mediaType property in the maven access spec and downloading particular maven files without tar.gz'ing them would still be desirable!
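Returning to Suggestion 1, the proposed hint structure (name, value, version) could be modelled roughly as below. This is a sketch under assumptions, not part of the ocm specification: whether an artifact carries one hint or a list of hints is not decided in this issue (a list is assumed here by analogy with labels), and the class names, the hint_for helper, and the example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Hint:
    name: str                      # identifies the uploader (type) the hint is meant for
    value: Any                     # uploader-specific payload
    version: Optional[str] = None  # version of the hint format


@dataclass
class Artifact:
    name: str
    type: str
    # Assumption: a list of hints, analogous to labels.
    hints: List[Hint] = field(default_factory=list)

    def hint_for(self, uploader_name):
        """Return the first hint addressed to the given uploader, or None."""
        for hint in self.hints:
            if hint.name == uploader_name:
                return hint
        return None


artifact = Artifact(
    name="mavengav",
    type="ociArtifact",
    hints=[Hint(name="oci", value="ocm.software/ocm-cli:v1.0.0", version="v1")],
)
```

An oci uploader would then ask for artifact.hint_for("oci") and ignore hints addressed to other uploaders, while an artifact without any hint remains valid because the property is optional.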