nwithers-ecr opened this issue 3 years ago
Hello!
We've added support for environment variables. So, once you've exposed the Datadog Agent as a service, you should be able to do something like this:
```yaml
---
apiVersion: k6.io/v1alpha1
kind: K6
metadata:
  name: k6-sample
spec:
  parallelism: 4
  arguments: --out statsd
  script:
    configMap:
      name: k6-test
      file: test.js
  env:
    - name: K6_STATSD_ADDR
      value: <servicename>.<namespace>.svc.cluster.local
  ports:
    - containerPort: 8125
```
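For reference, exposing the agent as a service might look roughly like the sketch below. The `app: datadog-agent` selector, the Service name, and the namespace are assumptions; they depend entirely on how your agent was deployed.

```yaml
# sketch: a ClusterIP Service in front of the agent's DogStatsD port
apiVersion: v1
kind: Service
metadata:
  name: datadog-agent        # hypothetical name; referenced as <servicename> above
  namespace: my-namespace    # hypothetical namespace
spec:
  selector:
    app: datadog-agent       # assumption: must match your agent pods' labels
  ports:
    - name: dogstatsd
      port: 8125
      targetPort: 8125
      protocol: UDP          # DogStatsD speaks UDP; Service ports default to TCP
```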
I can confirm dgzlopes's suggestion works. I exposed the datadog agent as a service named `datadog-agent` in namespace `my-namespace`, and this config did the trick:
```yaml
---
apiVersion: k6.io/v1alpha1
kind: K6
metadata:
  name: k6-sample
spec:
  parallelism: 4
  arguments: --out statsd
  script:
    configMap:
      name: k6-test
      file: test.js
  env:
    - name: K6_STATSD_ENABLE_TAGS
      value: "true"
    - name: K6_STATSD_ADDR
      value: datadog-agent.my-namespace.svc.cluster.local:8125
```
I had to include the port `:8125`. I also had to add `K6_STATSD_ENABLE_TAGS=true` to the K6 spec, as indicated in the YAML above.
I must be doing something wrong on the datadog side. When I run `kubectl get svc -n monitors`, I do not see a service listening on 8125.
```console
$ kubectl get svc -n monitors
NAME                                               TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
datadog-agent-cluster-agent                        ClusterIP   10.100.0.72      <none>        5005/TCP   7m17s
datadog-agent-cluster-agent-admission-controller   ClusterIP   10.100.111.229   <none>        443/TCP    7m17s
datadog-agent-kube-state-metrics                   ClusterIP   10.100.195.155   <none>        8080/TCP   7m17s
```
If I create a ClusterIP explicitly, it still fails to connect with the same error:

```console
$ kubectl expose pod --type="ClusterIP" --port 8125 --namespace monitors datadog-agent-hkwf9
NAME                          TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
datadog-agent-cluster-agent   ClusterIP   10.100.0.72      <none>        5005/TCP   77m
datadog-agent-hkwf9           ClusterIP   10.100.203.231   <none>        8125/UDP   15m
```

```yaml
- name: K6_STATSD_ADDR
  value: 10.100.203.231:8125
```

or

```yaml
- name: K6_STATSD_ADDR
  value: datadog-agent-hkwf9.monitors.svc.cluster.local:8125
```

Both fail with:

```
time="2021-09-04T00:58:49Z" level=error msg="Couldn't flush a batch" error="write udp 127.0.0.1:41460->127.0.0.1:8125: write: connection refused" output=statsd
```
My full `datadog-values.yaml` shows that dogstatsd should definitely be listening:
```yaml
---
registry: public.ecr.aws/datadog
datadog:
  apiKeyExistingSecret: datadog-secret
  clusterName: <omitted>
  logs:
    containerCollectAll: true
  dogstatsd:
    port: 8125
    useHostPort: true
    nonLocalTraffic: true
  apm:
    portEnabled: true
  processAgent:
    processCollection: true
  networkMonitoring:
    enabled: true
clusterAgent:
  admissionController:
    enabled: true
    tokenExistingSecret: "datadog-auth-token"
```
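Worth noting: since this values file sets `useHostPort: true`, DogStatsD is also reachable on each node's own IP, so an alternative to a Service is pointing k6 at the host IP through the downward API. A minimal sketch, assuming the k6 pod spec accepts standard Kubernetes env syntax:

```yaml
# sketch: resolve the node IP at runtime and feed it to k6's statsd output
env:
  - name: DD_AGENT_HOST
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP   # IP of the node this pod is scheduled on
  - name: K6_STATSD_ADDR
    value: $(DD_AGENT_HOST):8125   # $(VAR) expands when VAR is defined earlier in this list
```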
What's more, if I `kubectl exec` into the datadog agent pod, connecting via localhost on port 8125 fails, but port 8126 succeeds, which is expected since I have APM enabled.

```console
root@datadog-agent-9699q:/# curl -I --connect-timeout 1 127.0.0.1:8125
curl: (7) Failed to connect to 127.0.0.1 port 8125: Connection refused
root@datadog-agent-9699q:/# curl -I --connect-timeout 1 127.0.0.1:8126
HTTP/1.1 404 Not Found
Content-Type: text/plain; charset=utf-8
X-Content-Type-Options: nosniff
Date: Sat, 04 Sep 2021 01:12:08 GMT
Content-Length: 19
```
I'm pinned to version 0.6.0 of the operator. Should I be running this on the main branch, or did I miss something obvious in the Kubernetes networking? If not, I can close this issue, since it's confirmed working for others, and open a ticket with Datadog support.
@nwithers-ecr I had it working with the v0.6.0 operator. My guess is the problem is hiding in the datadog agent config, or perhaps in your Kubernetes networking or RBAC. I would talk with Datadog; they've been helpful to me in the past. Good luck!
@nwithers-ecr I had the same issue with sending results to Datadog. I spent some time investigating the reasons.

I had the same error:

```
"Couldn't flush a batch" error="write udp 127.0.0.1:41460->127.0.0.1:8125: write: connection refused"
```

which means that k6 tries to send data to localhost (the default address); this also means that the environment variables are not taking any effect:
```yaml
---
apiVersion: k6.io/v1alpha1
kind: K6
metadata:
  name: k6-sample
....
env:
  - name: K6_STATSD_ADDR
    value: datadog-agent.my-namespace.svc.cluster.local:8125 # <---- THIS VAR IS NOT DELIVERED TO CONTAINER
```
I double-checked this by entering the `k6-sample` container and printing all vars (`printenv`).

And the reason for that is: the environment variables pull request was closed. Instead, the latest version of k6-operator makes it possible to override the runner.
So here is a working solution.

Datadog agent deployment:
```yaml
# datadog-agent-deployment.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: datadog-agent-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      component: datadog-agent
  template:
    metadata:
      labels:
        component: datadog-agent
    spec:
      containers:
        - name: datadog-agent
          image: datadog/agent:latest
          ports:
            - containerPort: 8125
          env:
            - name: DD_SITE
              value: datadoghq.eu
            - name: DD_API_KEY
              value: <YOUR_DATADOG_API_KEY> # NB: a better way is to create a k8s Secret with the key and use envFrom.secretRef
            - name: DD_DOGSTATSD_NON_LOCAL_TRAFFIC
              value: "1"
```
Datadog agent ClusterIP service:
```yaml
# datadog-agent-cluster-ip-service.yml
apiVersion: v1
kind: Service
metadata:
  name: datadog-agent-cluster-ip-service
spec:
  type: ClusterIP
  selector:
    component: datadog-agent
  ports:
    - targetPort: 8125
      protocol: UDP
      port: 8125
```
K6 resource:
```yaml
apiVersion: k6.io/v1alpha1
kind: K6
metadata:
  name: k6-sample
spec:
  parallelism: 4
  arguments: --out statsd --verbose
  script:
    configMap:
      name: crocodile-stress-test
      file: performance.js
  scuttle:
    enabled: "false"
  runner: # <=== HERE
    image: loadimpact/k6:latest
    env: # <=== env is part of runner spec
      - name: K6_STATSD_ENABLE_TAGS
        value: "true"
      - name: K6_STATSD_ADDR
        value: datadog-agent-cluster-ip-service:8125
```
I also want to point out one more important note and a problem I faced: the Datadog agent doesn't aggregate metrics from several jobs, so in the Datadog dashboard I'm getting metrics from only one of the `k6-sample` runners (maybe I'm doing something wrong). In my script I have 20 VUs and parallelism is 4, yet in Datadog I'm getting 5 max VUs, which means that the Datadog agent is not combining data from all k6 runners.
@mpanchuk Glad you have a solution! Thanks for sharing it. Would you be willing to submit a PR with a new README on the integration?

And the aggregation part is definitely something being thought about in a broader context, because it's a problem everywhere.
@KnechtionsCoding Yes, I will create a PR with an updated README.
I'll close this now, since, from what I understand, this would be better resolved with a PR to https://github.com/grafana/k6-docs?
@na-- unless all the operator documentation has been moved over there, no. This needs to live with the k6-operator documentation, because it is specific to the k6-operator/K8s.
My mistake, you are completely right, sorry! :man_facepalming:
@mpanchuk Thank you for this. I applied your changes and it's working correctly.
I think this case can be documented in two ways: 1) an update to k6-operator's README on how to pass an environment variable to any pod in K6, both starter and runner (this is currently absent); 2) possibly a guide on how to set up Datadog with k6-operator.

Also, right now passing `env` outside of the runner or starter spec results in a validation error, so similar cases should be easier to set up.
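In other words, `env` has to sit under `runner` (or `starter`), never at the top level of `spec`. A minimal sketch of the accepted placement, assuming both pod specs take the standard Kubernetes env syntax:

```yaml
spec:
  runner:
    env:                      # delivered to the k6 runner pods
      - name: K6_STATSD_ADDR
        value: datadog-agent-cluster-ip-service:8125
  starter:
    env:                      # the starter pod gets its own env list
      - name: EXAMPLE_VAR     # hypothetical variable, for illustration only
        value: "example"
```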
Here is the way we figured out how to do it: k6 is running in Docker, and the Datadog agent is running in Docker as well. We found out that we need to add a new step which gets the IP address of the Datadog Docker container and adds it to `K6_STATSD_ADDR`:
```yaml
- name: Docker Agent
  env:
    DD_API_KEY: ${{ secrets.DATADOG_API_KEY }}
  run: |
    DOCKER_CONTENT_TRUST=1 \
    docker run -d --network bridge \
      --name datadog \
      -v /var/run/docker.sock:/var/run/docker.sock:ro \
      -v /proc/:/host/proc/:ro \
      -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \
      -e DD_SITE="datadoghq.com" \
      -e DD_API_KEY=$DD_API_KEY \
      -e DD_TAGS="YOURTAGS" \
      -e DD_DOGSTATSD_NON_LOCAL_TRAFFIC=1 \
      -p 8125:8125/udp \
      datadog/agent:latest
- name: Wait for Agent to run...
  run: |
    sleep 10
  shell: bash
- name: Get Datadog IP
  id: getDDIp
  run: |
    echo "::set-output name=ddIP::$( docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' datadog )"
- name: Run k6 test
  uses: grafana/k6-action@v0.2.0
  env:
    K6_STATSD_ADDR: ${{steps.getDDIp.outputs.ddIP}}:8125
    K6_STATSD_ENABLE_TAGS: "true"
  with:
    filename: packages/tests-performance/dist/alias/create/test.spec.js
    flags: --out statsd
```
Hope it helps you.
> The Datadog agent doesn't aggregate metrics from several jobs, so in the Datadog dashboard I'm getting metrics from only one of the `k6-sample` runners (maybe I'm doing something wrong).

I faced the same issue. Any chance you figured this one out?
Did you manage to use `@http.url` on a custom dashboard, maybe? I know that for the k6 integration there is only the `test_run_id` tag, which I can pass via the CR's `arguments` setup for the runner, like `--tag test_run_id=<value>`, but I want to build a custom dashboard, e.g. metrics by endpoint, with a dropdown search over `@http.url` to dynamically show those tests. Is this somehow possible to handle?
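For the tag-passing half of this, k6 itself accepts repeated `--tag` flags, so additional system-level tags can be injected through the CR's `arguments` field. A sketch (the tag values here are hypothetical, and whether Datadog then exposes such tags as dashboard facets is a separate question for the integration):

```yaml
apiVersion: k6.io/v1alpha1
kind: K6
metadata:
  name: k6-sample
spec:
  parallelism: 4
  # with K6_STATSD_ENABLE_TAGS=true, every emitted metric carries these tags
  arguments: --out statsd --tag test_run_id=run-42 --tag service=checkout
  script:
    configMap:
      name: k6-test
      file: test.js
```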
When I'm testing k6 locally through docker-compose, I am able to see the results being populated in the Datadog web dashboard. However, I'm struggling to convert this behavior to the k8s operator. Below is the configuration I've got so far, with Datadog deployed as a helm chart in a namespace called `monitors` and the k6 operator deployed at version 0.6.0:

- docker-compose.yml
- datadog-values.yaml
- resource.yml
- configmap.yml
- Current output of `kubectl apply -f resource.yml`

I believe what I'm missing is either the `K6_STATSD_ADDR` or the `DD_AGENT_HOST` environment variable (or both), which can be set with the below code. However, I'm not certain how to add these env vars to the `k6-sample` pods. Any ideas or helpful advice on how I can accomplish this?
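In case it helps future readers, such env vars typically look like the sketch below, with a hypothetical Service address that would need to match the actual deployment; and, per the answers above, they need to live under `runner.env` in the K6 resource:

```yaml
# sketch: the two env vars mentioned above; the address assumes a Service
# named datadog-agent in the monitors namespace, which may not match reality
env:
  - name: DD_AGENT_HOST
    value: datadog-agent.monitors.svc.cluster.local
  - name: K6_STATSD_ADDR
    value: datadog-agent.monitors.svc.cluster.local:8125
```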