spring-cloud / spring-cloud-dataflow

A microservices-based Streaming and Batch data processing in Cloud Foundry and Kubernetes
https://dataflow.spring.io
Apache License 2.0

Failure to get Application properties for images with OCI manifest #5819

Closed vicSenior closed 4 months ago

vicSenior commented 4 months ago

Description:

My goal is to create SCDF streams of applications that use custom Docker images. Each custom image carries its application properties in the JSON metadata org.springframework.cloud.dataflow.spring-configuration-metadata.json, declared with LABEL in the Dockerfile.

Suppose a stream is declared as the pipe http | my-custom-app | log.

The problem arises when SCDF tries to read the .config information from the my-custom-app manifest, because my-custom-app was published (by Azure DevOps automation) to a private Quay registry using the OCI standard, so the manifest mediaType is application/vnd.oci.image.manifest.v1+json.

Looking at the source code, I found that the REST HTTP GET calls always use the header Accept: application/vnd.docker.distribution.manifest.v2+json. For an image published with the OCI standard, that call returns a manifest with schemaVersion=1, which has no .config object.

Follow the smoke trail:

The exception ContainerRegistryException reported by the SCDF server component is the following:

2024-05-17 16:22:30.068  WARN 1 --- [io-8080-exec-10] ApplicationConfigurationMetadataResolver : Failed to retrieve properties for resource Docker Resource [docker:my-quay-registry.local/poc-scdf/my-custom-app:0.0.1] because of ContainerRegistryException: Image [my-quay-registry.local/poc-scdf/my-custom-app:0.0.1] has incorrect or missing manifest config element: {tag=0.0.1, name=poc-scdf/my-custom-app, architecture=amd64, schemaVersion=1, history=[{v1Compatibility={"created": "2024-05-16T13:20:41.393484176Z", "architecture": "amd64", "os": "linux", "config": {"User": "default", "Env": ["container=oci", "GECOS=JBoss user", "HOME=/home/jboss", ...

which is raised on line 55 of DefaultContainerImageMetadataResolver.java, in the method getImageLabels(String imageName):

// DefaultContainerImageMetadataResolver.java
if (manifest != null && !isNotNullMap(manifest.get("config"))) {
    throw new ContainerRegistryException(
        String.format("Image [%s] has incorrect or missing manifest config element: %s", 
            imageName, manifest.toString()));
}

Let's evaluate the two conditions:

  1. the first [manifest != null] is always true (because we can see the manifest object stringified in the exception message, by .toString() method),
  2. the second [!isNotNullMap(manifest.get("config"))] is therefore the cause of the problem, with manifest.get("config") a null object.
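The two conditions can be checked in isolation with a minimal sketch. The class ManifestConfigCheck and its isNotNullMap helper are hypothetical stand-ins written for illustration (the real helper lives in the SCDF source and may differ); the two maps only mimic the shape of the manifests shown in this issue.

```java
import java.util.Map;

// Hypothetical stand-in for the guard in getImageLabels(); not the SCDF source.
class ManifestConfigCheck {

    // Assumed re-implementation of the helper: true only for a non-null Map.
    static boolean isNotNullMap(Object value) {
        return value instanceof Map;
    }

    // Same condition as the excerpt above: a manifest exists but has no "config" map.
    static boolean wouldThrow(Map<String, Object> manifest) {
        return manifest != null && !isNotNullMap(manifest.get("config"));
    }

    public static void main(String[] args) {
        // Shape of the schemaVersion=1 manifest from the log message (abridged).
        Map<String, Object> schemaV1 = Map.of("schemaVersion", 1, "name", "poc-scdf/my-custom-app");
        // Shape of a schemaVersion=2 manifest, which carries a "config" object.
        Map<String, Object> schemaV2 = Map.of(
                "schemaVersion", 2,
                "config", Map.of("mediaType", "application/vnd.oci.image.config.v1+json"));

        System.out.println("schemaVersion 1 rejected: " + wouldThrow(schemaV1));
        System.out.println("schemaVersion 2 rejected: " + wouldThrow(schemaV2));
    }
}
```

Only the schemaVersion=1 shape trips the guard, because its config data is buried inside the v1Compatibility string rather than exposed as a top-level config object.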

Additionally, the REST HTTP GET that retrieves the manifest is in the method getImageManifest() of ContainerRegistryService.java, where (on line 203) the Accept header is set to the value of the String imageManifestMediaType:

// ContainerRegistryService.java
public <T> T getImageManifest(ContainerRegistryRequest registryRequest, Class<T> responseClassType) {
    String imageManifestMediaType = registryRequest.getRegistryConf().getManifestMediaType();
    if (!SUPPORTED_MANIFEST_MEDIA_TYPES.contains(imageManifestMediaType)) {
        throw new ContainerRegistryException("Not supported image manifest media type:" + imageManifestMediaType);
    }
    HttpHeaders httpHeaders = new HttpHeaders(registryRequest.getAuthHttpHeaders());
    httpHeaders.set(HttpHeaders.ACCEPT, imageManifestMediaType);
    // ... (rest of the method omitted in this excerpt)
}

The String imageManifestMediaType receives its value from registryRequest.getRegistryConf().getManifestMediaType() (line 198 in file ContainerRegistryRequest.java). Here registryRequest.getRegistryConf() returns an object of class ContainerRegistryConfiguration, and its method getManifestMediaType() returns the String manifestMediaType, whose default value is the constant DOCKER_IMAGE_MANIFEST_MEDIA_TYPE declared in ContainerRegistryProperties.java:

// ContainerRegistryConfiguration.java
public class ContainerRegistryConfiguration {
...
    /**
     * Image Manifest media type. Docker and OCI are supported.
     */
    private String manifestMediaType = ContainerRegistryProperties.DOCKER_IMAGE_MANIFEST_MEDIA_TYPE;
...
    public String getManifestMediaType() {
        return manifestMediaType;
    }

    public void setManifestMediaType(String manifestMediaType) {
        this.manifestMediaType = manifestMediaType;
    }

// ContainerRegistryProperties.java
@ConfigurationProperties(prefix = ContainerRegistryProperties.CONTAINER_IMAGE_METADATA_PREFIX)
public class ContainerRegistryProperties {
    public static final String CONTAINER_IMAGE_METADATA_PREFIX = "spring.cloud.dataflow.container";
    public static final String OCI_IMAGE_MANIFEST_MEDIA_TYPE = "application/vnd.oci.image.manifest.v1+json";
    public static final String DOCKER_IMAGE_MANIFEST_MEDIA_TYPE = "application/vnd.docker.distribution.manifest.v2+json";
    public static final String DOCKER_HUB_HOST = "registry-1.docker.io";
    public static final String DEFAULT_TAG = "latest";
    public static final String DEFAULT_OFFICIAL_REPO_NAMESPACE = "library";

In conclusion, SCDF (always) retrieves the manifest with the header Accept: application/vnd.docker.distribution.manifest.v2+json in the REST HTTP GET, which does not work for images built with the OCI standard: the registry then returns a manifest with schemaVersion=1, without the .config object.
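To make the role of the header concrete, here is a minimal Java sketch that only builds (and never sends) the manifest request with java.net.http. The class ManifestRequestSketch and its method are hypothetical and are not SCDF code, which uses Spring's HttpHeaders as in the excerpt above. Authentication is deliberately left out.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper for illustration; SCDF itself does not use this class.
class ManifestRequestSketch {
    // Same values as the constants in ContainerRegistryProperties.
    static final String OCI = "application/vnd.oci.image.manifest.v1+json";
    static final String DOCKER = "application/vnd.docker.distribution.manifest.v2+json";

    // Builds (but does not send) a GET for /v2/{repository}/manifests/{tag}.
    static HttpRequest manifestRequest(String host, String repository, String tag, String accept) {
        URI uri = URI.create("https://" + host + "/v2/" + repository + "/manifests/" + tag);
        return HttpRequest.newBuilder(uri)
                .header("Accept", accept) // the only part that differs between the two cases
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = manifestRequest(
                "my-quay-registry.local", "poc-scdf/my-custom-app", "0.0.1", DOCKER);
        System.out.println(request.method() + " " + request.uri());
        System.out.println("Accept: " + request.headers().firstValue("Accept").orElse("(none)"));
    }
}
```

The URL is identical for both media types; only the Accept value decides which manifest representation the registry is asked for.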

Steps to reproduce the issue:

Using curl with a specific Accept header, we can retrieve the manifest of a Docker image:

Command:

curl -s -L -X GET \
  -H "Authorization: Bearer $(curl -s -L -X GET -H "Authorization: Basic $CREDENTIALS" \
    "https://${URI_QUAY}/v2/auth?service=${URI_QUAY}&scope=repository:${REPOS}/${IMGNAME}:pull,push" \
    | jq '.token' -r)" \
  -H "Accept:${ACCEPT_HEADER}" \
  https://${URI_QUAY}/v2/${REPOS}/${IMGNAME}/manifests/${IMGTAG}

we get a `manifest` with `schemaVersion=1` (**bad**, because it does not contain the `.config` object):

```javascript
{
"tag": "0.0.1",
"name": "poc-scdf/my-custom-app",
"architecture": "amd64",
"schemaVersion": 1,
"history": [
{
"v1Compatibility": "{\"created\": \"2024-05-16T13:20:41.393484176Z\", \"architecture\": \"amd64\", \"os\": \"linux\", \"config\": {\"User\": \"default\", \"Env\": [\"container=oci\", \"GECOS=JBoss user\", \"HOME=/home/jboss\", \"UID=185\", \"USER=jboss\", \"JAVA_HOME=/usr/lib/jvm/java-17\", \"JAVA_VENDOR=openjdk\", \"JAVA_VERSION=17\", \"JBOSS_CONTAINER_OPENJDK_JDK_MODULE=/opt/jboss/container/openjdk/jdk\", ...
}
}
```

  • Use of Header Accept:application/vnd.oci.image.manifest.v1+json:
    
    # Variables:
    URI_QUAY="my-quay-registry.local"
    CREDENTIALS="cG9jLXNjZGYr...cxSkE2Ug=="
    REPOS="poc-scdf"
    IMGNAME="my-custom-app"
    IMGTAG="0.0.1"
    ACCEPT_HEADER="application/vnd.oci.image.manifest.v1+json"

Command:

curl -s -L -X GET \
  -H "Authorization: Bearer $(curl -s -L -X GET -H "Authorization: Basic $CREDENTIALS" \
    "https://${URI_QUAY}/v2/auth?service=${URI_QUAY}&scope=repository:${REPOS}/${IMGNAME}:pull,push" \
    | jq '.token' -r)" \
  -H "Accept:${ACCEPT_HEADER}" \
  https://${URI_QUAY}/v2/${REPOS}/${IMGNAME}/manifests/${IMGTAG}

we get a `manifest` with `schemaVersion=2` (**good**, because it contains the `.config` object):

```javascript
{
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"config": {
"mediaType": "application/vnd.oci.image.config.v1+json",
"digest": "sha256:efc06d6096cc88697e477abb0b3479557e1bec688c36813383f1a8581f87d9f8",
"size": 34268
},
...
}
```

Focus on the structure of the image manifests that compose my Stream Applications:

As mentioned above, this is the structure of the pipe that makes up my stream: http | my-custom-app | log. In detail:

> docker manifest inspect springcloudstream/http-source-kafka:3.2.1 | jq 'del(.layers)'
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "digest": "sha256:a5ee68d9b8ab0d22786cc57efda8b2adfd595030d638d8a98dda1c70ff036307",
    "size": 5565
  }
}
> docker manifest inspect my-quay-registry.local/poc-scdf/my-custom-app:0.0.1 | jq 'del(.layers)'
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:efc06d6096cc88697e477abb0b3479557e1bec688c36813383f1a8581f87d9f8",
    "size": 34268
  },
  "annotations": {
    "org.opencontainers.image.base.digest": "sha256:ae6bd2e0a72e7d59612a9a242899374ff7e0f358a5ff09ed9a16e325ca9b4d18",
    "org.opencontainers.image.base.name": "my-quay-registry.local/global/jdk17:latest"
  }
}
> docker manifest inspect springcloudstream/log-sink-kafka:3.2.1 | jq 'del(.layers)'
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "digest": "sha256:c976405b5e2588e58855eaf19769be724ba02084b05672602bbd09d704179b90",
    "size": 4421
  }
}

Proposal for solving the problem

Modify the method getImageLabels(String imageName) (the code from line 53 to 57 in DefaultContainerImageMetadataResolver.java) to check whether the retrieved manifest has schemaVersion=1 or schemaVersion=2. If schemaVersion=1, it may be useful to make a second attempt with a different Accept header in the REST HTTP GET, using the OCI value already defined as the String OCI_IMAGE_MANIFEST_MEDIA_TYPE (line 33 in ContainerRegistryProperties.java).
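A minimal sketch of this proposal, under stated assumptions: the class SchemaVersionFallback and the fetchManifest hook are hypothetical, with the hook standing in for the real call to the registry (Accept header value in, parsed manifest out). This is not the change that was eventually merged, only an illustration of the retry idea.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of the proposed retry; not SCDF code.
class SchemaVersionFallback {
    static final String DOCKER = "application/vnd.docker.distribution.manifest.v2+json";
    static final String OCI = "application/vnd.oci.image.manifest.v1+json";

    // fetchManifest is an assumed hook: Accept header value -> parsed manifest.
    static Map<String, Object> resolveManifest(Function<String, Map<String, Object>> fetchManifest) {
        Map<String, Object> manifest = fetchManifest.apply(DOCKER);
        if (isSchemaV1(manifest)) {
            // Second attempt with the OCI media type, as the proposal suggests.
            manifest = fetchManifest.apply(OCI);
        }
        return manifest;
    }

    // True when the manifest declares schemaVersion 1 (and so has no "config" object).
    static boolean isSchemaV1(Map<String, Object> manifest) {
        Object version = manifest == null ? null : manifest.get("schemaVersion");
        return version instanceof Number && ((Number) version).intValue() == 1;
    }
}
```

With a registry that behaves like the Quay instance described above, the first call would return the schemaVersion=1 manifest and the second the OCI manifest containing .config; for images stored with the Docker media type only one call is made.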

Release versions:

I'm using SCDF version 2.11.2 installed in a Kubernetes environment using Helm.

> kubectl describe deployment/scdf-spring-cloud-dataflow-server -n scdf
Name:                   scdf-spring-cloud-dataflow-server
Namespace:              scdf
CreationTimestamp:      Tue, 27 Feb 2024 16:07:36 +0100
Pod Template:
  Labels:           app.kubernetes.io/component=server
                    app.kubernetes.io/instance=scdf
                    app.kubernetes.io/managed-by=Helm
                    app.kubernetes.io/name=spring-cloud-dataflow
                    app.kubernetes.io/version=2.11.2
                    helm.sh/chart=spring-cloud-dataflow-26.8.0
Containers:
   server:
    Image:      docker.io/bitnami/spring-cloud-dataflow:2.11.2-debian-12-r11
> kubectl version --output=json
{
  "serverVersion": {
    "major": "1",
    "minor": "23",
    "gitVersion": "v1.23.8+vmware.3",
    "gitCommit": "53e803785668c31b8fcb41fc3607f3fdc3c87465",
    "gitTreeState": "clean",
    "buildDate": "2022-11-03T17:16:51Z",
    "goVersion": "go1.17.11",
    "compiler": "gc",
    "platform": "linux/amd64"
  }
}

corneil commented 4 months ago

@vicSenior Thank you for getting to the bottom of this.

corneil commented 4 months ago

Have you tried adding the following to Spring Cloud Data Flow Server configuration:

spring:
  cloud:
    dataflow:
      container:
        registry-configurations:
          quay-private:
            registry-host: quay-registry.local
            manifest-media-type: application/vnd.oci.image.manifest.v1+json
            # ... other properties

The container configuration is looked up by matching registry-host against the host in the image URI.
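As a rough illustration of host-based lookup, the sketch below extracts the registry host from an image reference using the usual Docker convention (the first path segment is a host if it contains a dot or a colon, or is "localhost") and then picks a media type per host. The class RegistryHostLookup and its methods are hypothetical; the actual SCDF lookup code may differ in detail.

```java
import java.util.Map;

// Hypothetical illustration of per-host configuration lookup; not SCDF code.
class RegistryHostLookup {
    // Same value as DOCKER_HUB_HOST in ContainerRegistryProperties.
    static final String DOCKER_HUB_HOST = "registry-1.docker.io";

    // Returns the registry host of an image reference, defaulting to Docker Hub.
    static String registryHost(String image) {
        String ref = image.startsWith("docker:") ? image.substring("docker:".length()) : image;
        int slash = ref.indexOf('/');
        if (slash > 0) {
            String first = ref.substring(0, slash);
            if (first.contains(".") || first.contains(":") || first.equals("localhost")) {
                return first;
            }
        }
        return DOCKER_HUB_HOST;
    }

    // One media type per registry host; every image on that host gets the same value.
    static String manifestMediaType(String image, Map<String, String> mediaTypeByHost, String defaultType) {
        return mediaTypeByHost.getOrDefault(registryHost(image), defaultType);
    }
}
```

Because the key is the host alone, every image on a given registry is requested with the same Accept value.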

https://dataflow.spring.io/docs/applications/application-metadata/#using-metadata-container-image-labels

vicSenior commented 4 months ago

@corneil Thank you, but your suggestion doesn't help: the registry 'quay-registry.local' holds mixed images (some with a Docker manifest mediaType, others with OCI).

corneil commented 4 months ago

@vicSenior We will look into implementing your proposal.

corneil commented 4 months ago

@vicSenior Have you checked the output of v2/{repository}/blobs/{digest} for both types of containers? Are they the same, or does the Accept header also affect those?

vicSenior commented 4 months ago

Hello @corneil, I've run the tests you requested.

Inspecting the source code, the method getImageBlob() (line 219 in ContainerRegistryService.java) does not set the Accept header for the REST HTTP GET call (see tests [b] below).

Anyway, I've run the tests for all cases (I hope this is enough to dispel any doubts).

So, for all [b] tests (curl v2/{repository}/blobs/{digest}) I obtained a JSON document with the same size and the same md5sum.

If you need anything else, I'm here. Thank you!

vicSenior commented 4 months ago

I gather that relying on the schemaVersion key is discouraged (because it may fall out of use in the future).

I thought of an alternative way to get the .config object from the manifest without using the schemaVersion key.

An alternative algorithm:

Consider the case where SCDF needs to retrieve configurations for the three images of a pipe img1 | img2 | img3, and suppose that:

  1. the mediaType of the manifest that describes each Docker image is not known,
  2. in the future, more than two standard mediaTypes may be used to define manifests (for example: OCI, Docker, Custom, Future, etc.).

For example:

public static final String CUSTOM_IMAGE_MANIFEST_MEDIA_TYPE = "application/vnd.custom.test1.manifest.v3+json";
public static final String FUTURE_IMAGE_MANIFEST_MEDIA_TYPE = "application/vnd.future.img.manifest.v1+json";
public static final String OCI_IMAGE_MANIFEST_MEDIA_TYPE = "application/vnd.oci.image.manifest.v1+json";
public static final String DOCKER_IMAGE_MANIFEST_MEDIA_TYPE = "application/vnd.docker.distribution.manifest.v2+json";

So, if we wanted to retrieve the .config object from a manifest without knowing its schemaVersion, we could:

  1. make a first call to the URL v2/{repository}/manifests/{reference} using a generic header Accept: application/json,
  2. read the response header, which will be Content-Type: application/X-Y-Z, where X-Y-Z is the specific mediaType the publisher used to package the image manifest in the registry,
  3. if the two headers (the request's Accept and the response's Content-Type) are the same, extract the .config object, because it is present,
  4. otherwise, if the two headers differ, make a second call to the URL v2/{repository}/manifests/{reference} using an Accept header with the value taken from the Content-Type response header (X-Y-Z in the example). The second response will then contain the .config object.
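The four steps above can be sketched as follows. Everything here is hypothetical: ContentTypeNegotiation, its Response holder, and the fetch hook (Accept value in, response out) are illustration-only names, and the sketch assumes, as the steps do, that the registry reports the stored manifest mediaType in the response header. A real Content-Type value may also carry parameters such as a charset, which this sketch does not strip.

```java
import java.util.function.Function;

// Hypothetical illustration of the two-call negotiation; not SCDF code.
class ContentTypeNegotiation {

    // Minimal view of a response: the Content-Type header value plus the raw body.
    static class Response {
        final String contentType;
        final String body;

        Response(String contentType, String body) {
            this.contentType = contentType;
            this.body = body;
        }
    }

    // fetch is an assumed hook: Accept header value -> response for one manifest URL.
    static Response fetchManifest(Function<String, Response> fetch, String firstAccept) {
        Response first = fetch.apply(firstAccept);              // steps 1 and 2
        if (first.contentType == null || firstAccept.equalsIgnoreCase(first.contentType)) {
            return first;                                       // step 3: headers match
        }
        return fetch.apply(first.contentType);                  // step 4: retry with reported type
    }
}
```

The firstAccept parameter lets the caller choose the opening request, whether the generic application/json from step 1 or a specific manifest media type.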

Optimization:

This algorithm requires at most two calls to retrieve the .config object from the manifest.

The first attempt can use the Docker standard as the Accept header (application/vnd.docker.distribution.manifest.v2+json), which in most cases is enough to obtain the .config object on the first try.

If, however, SCDF needs to query an image whose manifest mediaType differs from Docker (any other standard, even one defined in the future), a second and final call is enough to retrieve the .config object.

Examples:

Example A:

so with a total of 4 REST HTTP GET calls we get the .config objects for all images in the pipe.

Example B (worst case):

so with a total of 6 REST HTTP GET calls (n*2, where n is the number of images in the pipe) we get the .config objects for all images in the pipe.
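The call counts can be reproduced with a small worked sketch. The class CallCount is hypothetical, and the input lists are assumptions: a pipe with two Docker manifests and one OCI manifest (like http | my-custom-app | log) is one combination that gives four calls, and a pipe where no image matches the first Accept header gives the n*2 worst case.

```java
import java.util.List;

// Hypothetical arithmetic helper for the examples above; not SCDF code.
class CallCount {
    static final String DOCKER = "application/vnd.docker.distribution.manifest.v2+json";
    static final String OCI = "application/vnd.oci.image.manifest.v1+json";

    // One GET when the stored media type matches the first Accept header, two otherwise.
    static int totalGets(List<String> storedMediaTypes, String firstAccept) {
        int total = 0;
        for (String stored : storedMediaTypes) {
            total += stored.equals(firstAccept) ? 1 : 2;
        }
        return total;
    }

    public static void main(String[] args) {
        // Two Docker images and one OCI image, Docker tried first: 1 + 2 + 1 calls.
        System.out.println(totalGets(List.of(DOCKER, OCI, DOCKER), DOCKER));
        // No image matches the first attempt: 2 calls per image.
        System.out.println(totalGets(List.of(OCI, OCI, OCI), DOCKER));
    }
}
```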

Some practical suggestions:

In the method getImageManifest() of the class ContainerRegistryService (file ContainerRegistryService.java), after line 220 we can use the manifest object, which provides the Content-Type response header via:

manifest.getHeaders().getContentType().toString();

where .getContentType() returns a MediaType object.

cppwfs commented 4 months ago

Resolved with PR #5823 @vicSenior Thank you for raising this issue!