Open yorugac opened 2 years ago
Just thought I'd offer some background as we (read: I) opted for using the HTTP Rest API over going for direct pod invocation deliberately.
While using k6 resume
would work, it would require higher privileges for the starter pod, which I'm not really sure is an excellent idea. Allowing the starter pod to execute HTTP requests is by far less likely to cause issues than giving a pod the privileges to execute arbitrary shell commands in the load generators.
Hi @simskij, thank you for providing additional background and input :slightly_smiling_face:
I don't think I'm following here: what do you mean by higher privileges? Internally, k6 resume
is nothing but a wrapper around REST call.
My bad! I was interpreting your intention as calling k6 resume
inside the runners using the programmatic equivalent to kubectl exec
, which would then have required the starter to have execution privileges in the runner pods.
Yes, that'd be quite a different scenario :smile: Thanks for clarifying :+1: I'll update description a bit to be more precise.
This might not be fully viable after all :disappointed: given cases like https://github.com/grafana/k6-operator/pull/89
@yorugac @simskij
While I totally understand and appreciate the synchronization to start all k6 instances (parallel
>1) I am wondering why the operator does not simply issue those HTTP calls ([https://github.com/grafana/k6-operator/blob/40dce3ffc13c55fb665c41b891c1459d3ad66068/controllers/k6_start.go#L93
->](https://github.com/grafana/k6-operator/blob/40dce3ffc13c55fb665c41b891c1459d3ad66068/controllers/k6_start.go#L103 -> https://github.com/grafana/k6-operator/blob/40dce3ffc13c55fb665c41b891c1459d3ad66068/pkg/resources/containers/curl_start.go#L29) to the REST API endpoints natively? It just seems rather complex to spin up another job / pod to then just issue a few HTTP calls via curl.
Especially if you compare this to the setup phase (https://github.com/grafana/k6-operator/blob/40dce3ffc13c55fb665c41b891c1459d3ad66068/controllers/k6_start.go#L96 -> https://github.com/grafana/k6-operator/blob/main/controllers/common.go#L231 -> https://github.com/grafana/k6-operator/blob/40dce3ffc13c55fb665c41b891c1459d3ad66068/pkg/testrun/k6client.go#L17
which is already using the golang k6 client. Aligning this would also follow the idea of this issue to make the communication with K6 a little less freestyle.
The also some closer tracking of health and test progress of k6 pods coming up would likely a thing that could be added once the k6 REST API is used.
I am wondering why the operator does not simply issue those HTTP calls to the REST API endpoints natively? It just seems rather complex to spin up another job / pod to then just issue a few HTTP calls via curl.
Given the discussion above, I believe this is what this issue is basically about at this point :smile: I should update the description to reflect that.
This issue wasn't prioritized so far simply because nobody seemed to care, until recently. So thank you greatly for sharing your opinion @frittentheke!
But also, do any of existing users actually depend on starter
pod? One can specify starter
in TestRun
so if anyone does that, changing this behaviour now would mean breaking existing workflows.
If anyone uses .spec.starter
in TestRun
, please comment here!
For completeness sake, on this comment above. Currently I don't think this will be an issue as no other such cases came up, AFAIR... And the part of "waiting for services to be up" does not need to change. Only the logic after that will be impacted.
Updated description of the issue to be more up-to-date.
cc @ivanape I believe this issue is related to what we discussed before :slightly_smiling_face:
Problem
The starter is currently relying on REST API of k6 to start the runners. It is a workable solution that allows to synchronize the start of the test. But it requires to schedule a separate pod which might be time-consuming and / or ineffective depending on the setup and test run. Additionally, it is a limiting approach on very large tests.
Suggested solution
The starting functionality is built-in into k6 itself, even if just for 1-instance test. It is used in
k6 resume
CLI command of externally-controlled executor. So k6-operator could re-use existing k6 code base and implement a synchronized start of the test directly in theTestRun
controller. To achieve synchronization, the implementation should rely on Golang routines.This solution allows to resolve all the issue described above. But there are cons as well:
.spec.starter
option as unnecessary so it's potentially breaking. Request: if you rely on.spec.starter
in yourTestRun
's, please let us know!Additional context
Additional pros:
TestRun
logic which might fit well with https://github.com/grafana/k6-operator/issues/398TestRun
logic closer to what might one day appear as distributed mode in k6. https://github.com/grafana/k6-operator/issues/199k6 issue about REST API to keep an eye on: https://github.com/grafana/k6/issues/3497