apache / camel-k

Apache Camel K is a lightweight integration platform, born on Kubernetes, with serverless superpowers
https://camel.apache.org/camel-k
Apache License 2.0
868 stars 349 forks

Mount trait regression with release 2.5.0 #5924

Closed hernan-abi closed 1 day ago

hernan-abi commented 2 weeks ago

What happened?

An integration using the open-api trait deploys successfully in v2.4.x but fails in v2.5.0 with the same configuration.

I'm using a very basic OpenAPI spec and integration, essentially a hello-world API example for testing. While the integration is deploying, a Knative revision error appears in the integration conditions:

Condition "Ready" is "False" for Integration open-api-test: Configuration "open-api-test" is waiting for a Revision to become ready

This is probably nothing, but when I was taking a look at the knative-service for both my integrations (v2.4.0 and v2.5.0), the 2.5.0 ksvc has the open-api source referenced as the first source. I don't recall seeing this before but I'll continue looking into this unless anyone's got an idea of what's causing the regression.


Steps to reproduce

  1. Take any functioning example using the open-api trait in v2.4.x
  2. Apply this same configuration to a cluster installed with v2.5.0
  3. A Revision error is thrown during deployment

Relevant log output

Progress: integration "open-api-test" in phase Running
Condition "KnativeServiceAvailable" is "True" for Integration open-api-test: Knative service name is open-api-test
Condition "DeploymentAvailable" is "False" for Integration open-api-test: controller strategy: knative-service
(combined from similar events): Condition "ServiceTraitInfo" is "True" for Integration open-api-test: explicitly disabled by the platform: knative-service trait has priority over this trait
(combined from similar events): Condition "SecurityContextTraitInfo" is "True" for Integration open-api-test: explicitly disabled by the platform: pod security context is disabled for Knative Service. PodSecurityContext properties can affect non-user sidecar containers that come from Knative or your service mesh. Use container security context instead.
(combined from similar events): Condition "TraitInfo" is "True" for Integration open-api-test: Applied traits: camel,environment,logging,deployer,gc,knative-service,container,mount,pull-secret,quarkus,jvm,owner
(combined from similar events): Condition "Ready" is "False" for Integration open-api-test
Integration "open-api-test" in phase "Running"
(combined from similar events): Condition "Ready" is "False" for Integration open-api-test: The Route is still working to reflect the latest desired specification.
(combined from similar events): Condition "Ready" is "False" for Integration open-api-test: Configuration "open-api-test" is waiting for a Revision to become ready.
2024-11-05T16:11:03-05:00       ERROR   camel-k.scraper.pod     error caught during log scraping        {"name": "open-api-test-00001-deployment-75f4d6dd56-fl8gg", "error": "no state change after 30 seconds
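
For triage, the condition lines in output like the above can be filtered down to the ones reporting "False". The following is a minimal sketch, not part of Camel K: the function name `failing_conditions` and the regular expression are assumptions based only on the line format visible in this log.

```python
import re

# Pattern modelled on the condition lines in the log output above (an assumption,
# not an official Camel K log format specification).
CONDITION = re.compile(
    r'Condition "(?P<type>[^"]+)" is "(?P<status>True|False)" '
    r'for Integration (?P<name>[^\s:]+)(?:: (?P<message>.*))?$'
)


def failing_conditions(lines):
    """Return (type, message) pairs for every condition reported as False."""
    failing = []
    for line in lines:
        match = CONDITION.search(line)
        if match and match.group("status") == "False":
            failing.append((match.group("type"), match.group("message") or ""))
    return failing


# Three lines copied from the log output above.
sample = [
    'Condition "KnativeServiceAvailable" is "True" for Integration open-api-test: Knative service name is open-api-test',
    'Condition "DeploymentAvailable" is "False" for Integration open-api-test: controller strategy: knative-service',
    '(combined from similar events): Condition "Ready" is "False" for Integration open-api-test',
]

for cond_type, message in failing_conditions(sample):
    print(f"{cond_type}: {message}")
```

In this log, filtering leaves only "DeploymentAvailable" and "Ready", neither of which names the underlying cause, which matches the later observation that there is no explicit error.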

Camel K version

v2.4.0, v2.5.0

squakez commented 2 weeks ago

Hello, thanks for reporting. The order should not matter. I will have a look later in the day to see what the root cause of this failure could be. Would you mind running the same without Knative? Just to make sure this problem is a combination of openapi + knative and not openapi exclusively.

squakez commented 2 weeks ago

I've done some tests and I wasn't able to reproduce any error. Both with a plain Deployment and with a KnativeService, I can run the application with the openapi trait normally. A question though: what's the OpenAPI spec version you're using? I know that in newer Camel versions (>4.4), the older Swagger 2.x spec may not work. Also, you can do one more verification to confirm whether the problem is on the runtime side and not on the operator side, by running the application with a previous runtime (i.e., -t camel.runtime-version=3.8.0).

hernanDatgDev commented 2 weeks ago

The strange part is that there is no explicit error, yet the integration pod never actually starts. It's in a perpetual "Container starting" state. I've tried v2.5.0 with Knative enabled and disabled, and also with the runtime version set to 3.8.1, with no luck. I've been looking at the code changes that were made to the open-api trait and haven't found anything yet.

hernanDatgDev commented 2 weeks ago

Here is the basic groovy example I'm using (I originally used Java):

open-api-test.groovy:

// camel-k: trait=openapi.configmaps=basic-health-api
// camel-k: trait=knative-service.enabled=false

from('timer:heartbeat?period=5s')
    .log('heartbeat')
    .to('direct:health')

from('direct:health')
    .setBody().constant('{"isHealthy":true}')
    .log('${body}')

basic-health-api configmap contents:

{
  "openapi": "3.0.2",
  "info": {
    "title": "basic-api",
    "version": "1.0"
  },
  "paths": {
    "/health": {
      "get": {
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": {
                  "type": "string"
                }
              }
            },
            "description": "default response description"
          }
        },
        "operationId": "health"
      }
    }
  }
}

📓 Note: The following can be included, but the deployment times out in the same manner: // camel-k: trait=camel.runtime-version=3.8.1
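
To rule out the spec itself (the earlier question about Swagger 2.x versus OpenAPI 3.x), the configmap contents can be sanity-checked before deploying. This is a minimal sketch, assuming the spec is available as a JSON string; `check_spec` is a hypothetical helper, not a Camel K API. It reports the major spec version and the operationId values, which in this reproducer line up with the `direct:health` route.

```python
import json

# HTTP methods that may appear as operation keys under a path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def check_spec(spec_text):
    """Return (major_version, operation_ids) for an OpenAPI JSON document."""
    spec = json.loads(spec_text)
    # OpenAPI 3.x documents carry "openapi"; Swagger 2.x documents carry "swagger".
    version = spec.get("openapi") or spec.get("swagger") or ""
    major = version.split(".")[0]
    operation_ids = [
        operation["operationId"]
        for path_item in spec.get("paths", {}).values()
        for method, operation in path_item.items()
        if method in HTTP_METHODS and "operationId" in operation
    ]
    return major, operation_ids


# Condensed copy of the basic-health-api configmap contents shown above.
SPEC = """
{
  "openapi": "3.0.2",
  "info": {"title": "basic-api", "version": "1.0"},
  "paths": {
    "/health": {
      "get": {
        "responses": {"200": {"description": "default response description"}},
        "operationId": "health"
      }
    }
  }
}
"""

major, operation_ids = check_spec(SPEC)
print(f"spec major version: {major}")
for operation_id in operation_ids:
    # Assumption: each operation is expected to have a matching direct:<operationId> route.
    print(f"expects route: direct:{operation_id}")
```

For this spec the check reports major version 3 and a single operation, `health`, so the spec version is unlikely to be the cause here.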

squakez commented 2 weeks ago

Thanks for the reproducer. I'll try that out. I wonder if this specific endpoint (health) is conflicting with the normal health endpoint we're providing.

squakez commented 1 week ago

I managed to reproduce the issue. It seems to be a regression in the mount trait:

Warning  FailedMount  59s (x8 over 2m3s)  kubelet            MountVolume.SetUp failed for volume "i-source-001" : configmap "api-source-001" not found

The problem is that we are referencing a source index that doesn't exist. We need to fix this and release a patch.
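
The failure mode in the kubelet event can be illustrated with a small check: every volume that references a configmap needs that configmap to exist, otherwise the pod stays in "Container starting". This is a sketch only, not the operator's code. `missing_configmaps` is a hypothetical helper, and the `i-source-000` / `api-source-000` entries are assumed for illustration; only `i-source-001` and `api-source-001` come from the event above.

```python
def missing_configmaps(volumes, existing_configmaps):
    """Return (volume, configmap) pairs whose configmap does not exist."""
    return [
        (volume, configmap)
        for volume, configmap in volumes.items()
        if configmap not in existing_configmaps
    ]


# Volume name -> referenced configmap name.
# The 000 entries are hypothetical; the 001 entries are taken from the kubelet event.
volumes = {
    "i-source-000": "api-source-000",
    "i-source-001": "api-source-001",
}

# Configmaps assumed to exist in the namespace (hypothetical).
existing = {"api-source-000"}

for volume, configmap in missing_configmaps(volumes, existing):
    print(f'volume "{volume}" references missing configmap "{configmap}"')
```

Against a live cluster, the same comparison can be made by hand: list the volumes in the pod spec and the configmaps in the namespace, then look for a referenced name with no match.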

hernanDatgDev commented 1 week ago

@squakez I assumed you were handling this. Is that right? Is this something I might be able to help out on?

squakez commented 1 week ago

> @squakez I assumed you were handling this. Is that right? Is this something I might be able to help out on?

Hello. Yes, I'm planning to work on this during the weekend if I have time. If you're in a rush, feel free to pick it up though.