camunda / camunda-platform-helm

Camunda Platform 8 Self-Managed Helm charts
https://docs.camunda.io/docs/self-managed/overview/
Apache License 2.0

[ISSUE] Operate and tasklist fail to connect to zeebe (requesting partition ids) when OIDC is enabled on 8.6-alpha5 #2385

Closed jessesimpson36 closed 1 month ago

jessesimpson36 commented 1 month ago

Describe the issue:

When installing from the 8.6 alpha subdirectory with a configuration that has OIDC enabled, Operate and Tasklist cannot connect to Zeebe.

In this environment, all the applications are Ready except for Operate and Tasklist:

[Screenshot: 2024-09-26-142720_grim]

The Operate log file shows:

      io.camunda.operate.zeebe.PartitionHolder - Error occurred when requesting partition ids from Zeebe client: null
io.camunda.zeebe.client.api.command.ClientStatusException: null
    at io.camunda.zeebe.client.impl.ZeebeClientFutureImpl.transformExecutionException(ZeebeClientFutureImpl.java:116) ~[zeebe-client-java-8.6.0-alpha5.jar:8.6.0-alpha5]
    at io.camunda.zeebe.client.impl.ZeebeClientFutureImpl.join(ZeebeClientFutureImpl.java:54) ~[zeebe-client-java-8.6.0-alpha5.jar:8.6.0-alpha5]

This indicates that when Operate connects to Zeebe, it cannot call the API.

In the Zeebe logs, the error we see is:

Caused by: io.atomix.cluster.messaging.MessagingException$NoRemoteHandler: No remote message handler registered for this message, subject cluster-topology-sync
    ... 22 more

So I think the API that Operate is requesting from Zeebe is cluster-topology-sync. For further testing, we could use this example project to exercise the endpoints: https://github.com/camunda-community-hub/camunda-8-examples/blob/main/zeebe-client-plain-java/src/main/java/io/camunda/zeebe/example/cluster/TopologyViewer.java

I've also tested this same values.yaml against the latest 8.5 release, and everything came up healthy.

I've also tested against 8.6.0-rc1, and the issue still occurs in rc1.

Actual behavior:

Expected behavior:

How to reproduce:

Logs:

operate-logs.txt tasklist.txt zeebe-gateway-logs.txt zeebe-logs.txt

Environment:

Please note: Without the following info, it's hard to resolve the issue and probably it will be closed.

jessesimpson36 commented 1 month ago

https://github.com/camunda/distribution/issues/306

jessesimpson36 commented 1 month ago

Right now, I think this might be an issue with the Zeebe gateway. When I authenticate and then send a request to its /v2/topology REST API endpoint, I get the following error:

 curl --request POST ${ZEEBE_AUTHORIZATION_SERVER_URL} \
     --header 'Content-Type: application/x-www-form-urlencoded' \
     --data-urlencode 'grant_type=client_credentials' \
     --data-urlencode "audience=${ZEEBE_TOKEN_AUDIENCE}" \
     --data-urlencode "client_id=${ZEEBE_CLIENT_ID}" \
     --data-urlencode "client_secret=${ZEEBE_CLIENT_SECRET}" \
     --data-urlencode "scope=${ZEEBE_TOKEN_SCOPE}"
{
  "token_type": "Bearer",
  "expires_in": 3599,
  "ext_expires_in": 3599,
  "access_token": "MYBEARERTOKEN"
}
curl -L -X GET \
      -H 'Accept: application/json' \
      -H 'Authorization: Bearer MYBEARERTOKEN' \
      'http://localhost:8080/v2/topology'
{
  "type": "about:blank",
  "title": "Unauthorized",
  "status": 401,
  "detail": "URI with undefined scheme",
  "instance": "/v2/topology"
}

This also matches the error in the Zeebe gateway logs:

java.lang.IllegalStateException: java.lang.IllegalArgumentException: URI with undefined scheme
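
The two curl calls above can be chained into one repeatable check. The following is a minimal sketch, not a tested tool: the function names and the GATEWAY_URL variable are hypothetical and only used for illustration, while the ZEEBE_* variables are the same ones used in the token request above. The token is extracted with sed so that jq is not required, assuming the token value contains no escaped quotes.

```shell
#!/bin/sh
# Sketch: request a client-credentials token, then call /v2/topology with it.
set -eu

# Print the "access_token" value from a JSON token response.
# Prints nothing if the field is missing.
extract_access_token() {
  printf '%s' "$1" | tr -d '\n' |
    sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Request a token with the same form fields as the manual curl call.
fetch_token_response() {
  curl --silent --show-error --request POST "${ZEEBE_AUTHORIZATION_SERVER_URL}" \
    --header 'Content-Type: application/x-www-form-urlencoded' \
    --data-urlencode 'grant_type=client_credentials' \
    --data-urlencode "audience=${ZEEBE_TOKEN_AUDIENCE}" \
    --data-urlencode "client_id=${ZEEBE_CLIENT_ID}" \
    --data-urlencode "client_secret=${ZEEBE_CLIENT_SECRET}" \
    --data-urlencode "scope=${ZEEBE_TOKEN_SCOPE}"
}

# Call the gateway topology endpoint with the given bearer token.
# GATEWAY_URL is a hypothetical variable; it defaults to the address used above.
check_topology() {
  curl --silent --show-error --location \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $1" \
    "${GATEWAY_URL:-http://localhost:8080}/v2/topology"
}
```

With the functions loaded and the ZEEBE_* variables exported, running `check_topology "$(extract_access_token "$(fetch_token_response)")"` should return the same 401 response with "URI with undefined scheme" on an affected installation.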
jessesimpson36 commented 1 month ago

I was able to resolve this issue by supplying the following variable in the zeebe-gateway configmap:

    camunda:
      identity:
        baseUrl: http://cpt-identity:80

jessesimpson36 commented 1 month ago

[Screenshot: 2024-09-27-163208_grim]

jessesimpson36 commented 1 month ago

Since https://github.com/camunda/camunda-platform-helm/pull/2389 has been merged, we can now close this issue.