Open slolatte opened 2 weeks ago
I think this is because the connector pod sometimes fails to start. I normally see this type of error when the connector pod hangs:
2024-09-16T09:45:51.506Z WARN 1 --- [pool-2-thread-9] i.c.z.client.impl.ZeebeCallCredentials : The request's security level does not guarantee that the credentials will be confidential.
The integration test is not flaky; this is a reported issue in Connectors that shows up in the logs as follows:
2024-09-27T08:23:09.840Z WARN 1 --- [lt-executor-141] io.camunda.zeebe.client.job.poller : Failed to activate jobs for worker HTTP REST and job type io.camunda:http-json:1
io.grpc.StatusRuntimeException: CANCELLED
at io.grpc.Status.asRuntimeException(Status.java:533)
at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:481)
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:564)
at io.grpc.internal.ClientCallImpl.access$100(ClientCallImpl.java:72)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:729)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:710)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.io.IOException: Failed while requesting access token with status code 401 and message Unauthorized.
at io.camunda.zeebe.client.impl.oauth.OAuthCredentialsProvider.fetchCredentials(OAuthCredentialsProvider.java:157)
at io.camunda.zeebe.client.impl.oauth.OAuthCredentialsCache.computeIfMissingOrInvalid(OAuthCredentialsCache.java:100)
at io.camunda.zeebe.client.impl.oauth.OAuthCredentialsProvider.applyCredentials(OAuthCredentialsProvider.java:79)
at io.camunda.zeebe.client.impl.ZeebeCallCredentials.lambda$applyRequestMetadata$0(ZeebeCallCredentials.java:49)
... 3 common frames omitted
The issue is not limited to that Connectors worker; many other workers log the same error. Restarting the Pod fixes the issue, which makes it more likely that the problem lies in the app's retry logic.
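To make the retry-logic hypothesis concrete, here is a minimal sketch of retrying a token request with exponential backoff. It is not the Zeebe client's actual implementation, and the thread does not confirm how the client retries. `TokenRequestError`, `fetch_token_with_retry`, and `make_flaky_fetcher` are hypothetical names introduced only for illustration; the 401 mirrors the status code in the stack trace above.

```python
import time
from typing import Callable


class TokenRequestError(Exception):
    """Hypothetical error for a rejected token request (e.g. HTTP 401)."""

    def __init__(self, status_code: int, message: str) -> None:
        super().__init__(f"token request failed with status {status_code}: {message}")
        self.status_code = status_code


def fetch_token_with_retry(
    fetch_token: Callable[[], str],
    max_attempts: int = 5,
    initial_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_token until it succeeds or max_attempts is reached.

    The wait doubles after every failed attempt and is capped at max_delay.
    The last failure is re-raised so the caller still sees the real error.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch_token()
        except TokenRequestError:
            if attempt == max_attempts:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)
    raise AssertionError("unreachable")


def make_flaky_fetcher(failures: int) -> Callable[[], str]:
    """Hypothetical stand-in for an identity provider that rejects the
    first `failures` requests with 401 and then returns a token."""
    state = {"calls": 0}

    def fetch() -> str:
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TokenRequestError(401, "Unauthorized")
        return "example-token"

    return fetch


if __name__ == "__main__":
    recorded_delays = []
    # Record the delays instead of sleeping so the example runs instantly.
    token = fetch_token_with_retry(make_flaky_fetcher(2), sleep=recorded_delays.append)
    print(token, recorded_delays)  # prints: example-token [1.0, 2.0]
```

If the 401 is only transient (for example, because the identity provider is not ready when the Pod starts), a loop like this would recover without a Pod restart. If the client instead keeps reusing a failed or stale credential, a restart would be the only thing that clears it, which would match the behavior described above.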
I've disabled the Connectors test for the 8.6 chart until the bug is fixed: https://github.com/camunda/camunda-platform-helm/commit/5784bc56fd6162269090f6fea018e142a2c15c9d
Describe the issue:
I've noticed that the integration test 'TEST-Check-Connectors-webhook' is quite flaky and often causes the Helm chart setup to fail when triggered via GHA. Rerunning the workflow usually fixes the issue.
Actual behavior:
An example of this can be seen in this test run.
Expected behavior:
I would expect the integration test to be more robust, and I suggest refactoring it.
How to reproduce:
Context - please see Slack thread.
Logs:
Environment: