kubeshop / testkube

☸️ Kubernetes-native testing framework for test execution and orchestration
https://testkube.io

retry count exceeeded, there are no active pods with given id=xecutions3476f5634280701 #2851

Open · tkonieczny opened 1 year ago

tkonieczny commented 1 year ago

Describe the bug

⨯ retry count exceeeded, there are no active pods with given id=xecutions3476f5634280701
Use following command to get test execution details:
$ kubectl testkube get execution 638dfd96e3476f5634280701

The error appeared while trying to run the Postman sanity tests. I have no idea how to reproduce it; I'm filing this issue to track whether it ever reappears.

Workflow that failed: https://github.com/kubeshop/helm-charts/blob/da69a3f9ba4e5379fa718a3d9ab230fd65a59cb8/.github/workflows/helm-releaser-testkube-charts.yaml#L283

Btw, there's a typo in "exceeeded" (it should be "exceeded").

vsukhin commented 1 year ago

Sounds like it is coming from this method: func NewWatchExecutionCmd() *cobra.Command
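For context, here is a minimal Go sketch of the kind of bounded polling loop that can end in this error. The function name, label selector, retry count, and interval are illustrative assumptions, not Testkube's actual implementation:

```go
package sketch

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// watchPodByExecutionID polls for the pod backing an execution and gives up
// after maxRetries, producing the error from this issue (the real message
// also carries the "exceeeded" typo noted above).
func watchPodByExecutionID(ctx context.Context, c kubernetes.Interface, ns, id string, maxRetries int) error {
	for i := 0; i < maxRetries; i++ {
		pods, err := c.CoreV1().Pods(ns).List(ctx, metav1.ListOptions{
			// Hypothetical selector: how the pod is matched to the
			// execution is an assumption here.
			LabelSelector: "executionID=" + id,
		})
		if err != nil {
			return err
		}
		if len(pods.Items) > 0 {
			return nil // pod found; watching can proceed
		}
		time.Sleep(time.Second)
	}
	// If the pod never shows up (e.g. it finished and was garbage-collected
	// before the first poll, or was never scheduled), we end up here.
	return fmt.Errorf("retry count exceeded, there are no active pods with given id=%s", id)
}
```

A race like this would also explain why the error is hard to reproduce: it would only trigger when the pod disappears (or never appears) within the whole retry window.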

tkonieczny commented 1 year ago

Looks like this issue reappeared at Demo:

{"type":"error","content":"retry count exceeeded, there are no active pods with given id=63ea31c28be7fbf0e1b6987f","time":"2023-02-13T13:21:44.664242595Z"}

https://demo.testkube.io/tests/executions/container-executor-curl-smoke-negative/execution/63ea31c28be7fbf0e1b6987f

Additionally, this execution is marked as running in the dashboard (and as started in the CLI); a diagnostic sketch follows the output below:

ID:         63ea31c28be7fbf0e1b6987f
Name:       executor-container-smoke-tests-container-executor-curl-smoke-negative-118
Number:     118
Test name:  container-executor-curl-smoke-negative
Type:       container-executor-curl/test
Status:     running
Start time: 2023-02-13 12:49:06.707 +0000 UTC
End time:   0001-01-01 00:00:00 +0000 UTC
Duration:
Labels:     core-tests=executors-negative

  Variables:    1
  - URL = https://testkube.non.existing.url.example
Args:     $(URL)

Test execution started
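A quick way to confirm the stale status is to check whether any pod still backs that execution. Below is a minimal client-go sketch, assuming executor pods carry the execution ID in their names and run in the testkube namespace (both are assumptions; adjust for your install):

```go
package main

import (
	"context"
	"fmt"
	"strings"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the default kubeconfig location.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	const executionID = "63ea31c28be7fbf0e1b6987f"
	pods, err := client.CoreV1().Pods("testkube").List(context.Background(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	found := false
	for _, p := range pods.Items {
		if strings.Contains(p.Name, executionID) {
			found = true
			fmt.Printf("%s\t%s\n", p.Name, p.Status.Phase)
		}
	}
	if !found {
		// No pod left: the execution record is stale, matching the
		// "running" status with no backing pod described above.
		fmt.Println("no pods found for execution", executionID)
	}
}
```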
neomusic commented 11 months ago

Any update on this? We have the same issue.

vsukhin commented 11 months ago

hey, @neomusic

  1. We're planning to improve our architecture in the next couple of months and move the test scheduler into a separate service, making it more fault tolerant and able to track test pod state carefully. This should improve Testkube's observability and error processing.
  2. Meanwhile, for Testkube 1.15 (planned for next week) this enhancement was done: https://github.com/kubeshop/testkube/pull/4387. It should persist more details when a pod fails or is terminated abnormally (see the sketch below).
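As a rough illustration of item 2 (a sketch of the idea, not the actual PR code): read container termination details from a failed pod so they can be persisted with the execution result instead of being lost when the pod is cleaned up.

```go
package sketch

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// recordPodFailure captures container termination details for a failed pod
// so they can be stored with the execution result. Illustrative only; the
// function name and return format are assumptions.
func recordPodFailure(ctx context.Context, c kubernetes.Interface, ns, podName string) (string, error) {
	pod, err := c.CoreV1().Pods(ns).Get(ctx, podName, metav1.GetOptions{})
	if err != nil {
		return "", err // includes "not found" when the pod was already deleted
	}
	if pod.Status.Phase != corev1.PodFailed {
		return "", nil
	}
	// Collect exit codes and reasons from all containers that terminated
	// with a non-zero exit code.
	details := fmt.Sprintf("pod %s failed: %s", pod.Name, pod.Status.Reason)
	for _, cs := range pod.Status.ContainerStatuses {
		if t := cs.State.Terminated; t != nil && t.ExitCode != 0 {
			details += fmt.Sprintf("; container %s exited %d (%s)", cs.Name, t.ExitCode, t.Reason)
		}
	}
	return details, nil
}
```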