Open Phil1602 opened 1 month ago
This loosely relates to https://github.com/grafana/k6-operator/pull/401 which is about treating initializer errors as error state of the whole TestRun CR.
Hi @Phil1602, as mentioned by @frittentheke, this indeed has been raised and fixed: could you please update k6-operator to the latest version and try again? Thanks!
Also, in general, it is recommended to debug k6 scripts locally before deploying the TestRun. :slightly_smiling_face:
Hi @yorugac,
In the meantime, we are running k6 verification as a pipeline step before creating the TestRun. Anyway, IMO it is still an issue if an invalid TestRun is not reported as such.
I will try out the latest release v0.0.16 and verify your assumptions! Thanks for the hints!
@yorugac, https://github.com/grafana/k6-operator/pull/401 does indeed treat an error of the initializer Pod as an error of the TestRun CR (https://github.com/grafana/k6-operator/blob/d9490ded7c3e0cf615e2e9d41e82a842fdae7ac8/controllers/common.go#L59).
The cause of the issue @Phil1602 reported here, though, lies in the exit code, which is what determines whether the Pod actually fails. If you look at https://github.com/grafana/k6-operator/blob/d9490ded7c3e0cf615e2e9d41e82a842fdae7ac8/pkg/resources/jobs/initializer.go#L79 you'll notice that multiple commands are chained and piped together there. While `&&` stops the chain at the first command with a non-zero exit code (and returns that code), the second part, which pipes into `grep`, masks the exit code of `k6 inspect` (the most important bit of this command): https://github.com/grafana/k6-operator/blob/d9490ded7c3e0cf615e2e9d41e82a842fdae7ac8/pkg/resources/jobs/initializer.go#L79C124-L79C167
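To illustrate the masking effect in isolation, here is a minimal sketch. It does not reproduce the operator's actual command: `failing_inspect` is a hypothetical stand-in for a `k6 inspect` call that prints something and then exits non-zero, and the string `placeholder` is made up for the demo.

```sh
# Hypothetical stand-in for a failing `k6 inspect`:
# prints some output, then exits with a non-zero status.
failing_inspect() {
  echo '{"placeholder": true}'
  return 3
}

# Piped: the status of a pipeline is the status of its last command (grep),
# so the failure of failing_inspect is lost.
failing_inspect | grep -q 'placeholder'
masked_status=$?

# Unpiped: capture the output first and keep the producer's status,
# then filter the captured output separately.
real_status=0
output=$(failing_inspect) || real_status=$?
printf '%s\n' "$output" | grep -q 'placeholder'

echo "piped status:   $masked_status"   # prints "piped status:   0"
echo "unpiped status: $real_status"     # prints "unpiped status: 3"
```

In shells that support it (bash, for example), `set -o pipefail` is another way to let a failure inside the pipeline surface as the pipeline's status.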
I went through the initializer logic some more and just pushed PR https://github.com/grafana/k6-operator/pull/450.
I know this changes a little more than just fixing this issue here. But I strongly believe that reducing the interface to just the exit code plus the termination message allows the initializer to really thrive and be much more flexible than it is now.
As a user, I can then run any image and any (list of) commands, and the only thing I have to ensure is that a non-zero exit code is returned if there is an issue with the test.
I pushed a bugfix PR in https://github.com/grafana/k6-operator/issues/453, just fixing the issue reported by @Phil1602
^^ @yorugac
Brief summary
We realized that our TestRuns get stuck, without any information or logs printed, if the script itself is incorrect.
Cause
We already had a deeper look and realized that this is likely related to the log message parsing of the `k6 inspect` executed within the initializer here: https://github.com/grafana/k6-operator/blob/f75facb321d3c8ca55bbd9ba2f1895173d10bbc7/pkg/resources/jobs/initializer.go#L79
When we execute `k6 inspect` manually inside the container, we get the following error:
The log parsing mentioned above basically does a `| grep 'level=error'`, which does not work for the error message we are facing, since the log format seems to be different.
Might be related to: https://github.com/grafana/k6-docs/issues/877
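The failure mode described above can be sketched as follows. Both sample lines are hypothetical placeholders, not the actual k6 output from this report (which is not included here); they only show that a filter keyed on the literal `level=error` misses an error that is reported in any other format.

```sh
# Hypothetical sample lines, NOT real output from this issue.
logfmt_line='time="2024-01-01T00:00:00Z" level=error msg="placeholder failure"'
other_line='ERRO[0000] placeholder failure'

# Reports whether the same kind of filter (grep for level=error) sees an error.
matches_error() {
  if printf '%s\n' "$1" | grep -q 'level=error'; then
    echo yes
  else
    echo no
  fi
}

logfmt_match=$(matches_error "$logfmt_line")
other_match=$(matches_error "$other_line")

echo "logfmt line detected: $logfmt_match"   # prints "logfmt line detected: yes"
echo "other line detected:  $other_match"    # prints "other line detected:  no"
```

Relying on the exit code of the command, rather than on the shape of its log lines, avoids this dependency on the log format.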
k6-operator version or image
ghcr.io/grafana/k6-operator:controller-v0.0.14
Helm chart version (if applicable)
No response
TestRun / PrivateLoadZone YAML
Since we built a custom k6 image to include the script to be used as localfile, it would need some additional effort to make this available.
However, IMO this is not really related to a specific TestRun.
Other environment details (if applicable)
k6 version: k6 v0.51.0 (go1.22.4, linux/amd64)
Steps to reproduce the problem
Expected behaviour
k6 inspect
Actual behaviour
initialization phase
Logs of k6-operator
``` 2024-07-30T09:11:00Z ERROR controllers.TestRun unable to marshal: `` {"namespace": "loadtesting", "name": "