grafana / k6-jslib-aws

JavaScript library for interacting with AWS resources from k6 scripts
Apache License 2.0

S3Client returns null when `K6_DISCARD_RESPONSE_BODIES` is set to true #45

Open oleiade opened 1 year ago

oleiade commented 1 year ago

A user recently reported the following issue in the k6 open-source support forum:

K6 S3Client returns null with K6_DISCARD_RESPONSE_BODIES=true. Use case: I need the body of the S3 client response in order to create test data, but for the actual load test responses, checking the HTTP status is sufficient.

We should investigate this and figure out whether this is the intended behavior or whether we need to adjust the behavior of the jslib.

bendennis commented 1 year ago

Thanks for surfacing this issue on GitHub! I ran into this problem and it caused me to spin for quite a while thinking that there was either a bug with the client or a problem with my AWS config.

I added the following code to my setup() method after confirming that the config seemed reasonable:

  const config = AWSConfig.fromEnvironment();
  const client = new S3Client(config);

  const buckets = client.listBuckets();
  console.log(buckets);

This ran without any issue, and resulted in the following log in the k6 output:

INFO[0000] []

I also tested the getObject() method, and confirmed that no data is returned when discardResponseBodies is true. From my perspective this definitely feels like a defect, since I need to pull data from S3 that will be consumed by VUs as my test is executed.

i02302-stb commented 1 year ago

Thank you for creating the issue. Here is a code example that reproduces the problem.

Files needed to reproduce:

data.json
This is the test data stored in S3. The actual file is larger.
```
{
  "requests": [
    { "query": "/hoge" },
    { "query": "/fuga" }
  ]
}
```

script.js
I set this in a ConfigMap in Kubernetes. Including the whole ConfigMap seems redundant, so I'll only share script.js.
```
import http from 'k6/http';
import { AWSConfig, S3Client } from 'https://jslib.k6.io/aws/0.7.1/s3.js';

const awsConfig = new AWSConfig({
  region: `${__ENV.AWS_REGION}`,
  accessKeyId: `${__ENV.AWS_ACCESS_KEY_ID}`,
  secretAccessKey: `${__ENV.AWS_SECRET_ACCESS_KEY}`,
});

const s3 = new S3Client(awsConfig);

export let options = {
  stages: [
    { target: `${__ENV.TEST_TARGET}`, duration: `${__ENV.TEST_DURATION}` },
  ]
};

export function setup() {
  const object = s3.getObject(`${__ENV.S3_BUCKET_NAME_TEST_DATA}`, `${__ENV.S3_KEY_TEST_DATA}`);
  console.log(object); // I'm logging here because I want to clarify the problem.
  return { requests: JSON.parse(object.data).requests };
}

export default function (data) {
  const index = Math.floor(Math.random() * data.requests.length)
  const base_url = `${__ENV.BASE_URL}`
  const url = base_url + data.requests[index].query
  const result = http.get(url);
};
```

k6.yaml
This may not be necessary for some people. I happened to be using K6 Operator. I'm just posting here with the intention of sharing how the environment variables are set in my repro environment.
```
---
apiVersion: k6.io/v1alpha1
kind: K6
metadata:
  name: api-test
  namespace: k6-operator-system
spec:
  parallelism: 1
  arguments: --out statsd --tag test_run_id=${run_test_id} --include-system-env-vars
  script:
    configMap:
      name: ${configmap_name}
      file: script.js
  # cleanup: "post"
  runner:
    nodeselector:
      hoge/node-type: default
    image: grafana/k6:0.44.0
    env:
      - name: K6_STATSD_ENABLE_TAGS
        value: "true"
      - name: K6_STATSD_ADDR
        value: datadog.kube-system:8125
      - name: AWS_REGION
        value: "${region}"
      - name: S3_BUCKET_NAME_TEST_DATA
        value: "${s3_bucket}"
      - name: S3_KEY_TEST_DATA
        value: ${s3_key}
      - name: TEST_TARGET
        value: "10"
      - name: TEST_DURATION
        value: "300s"
      - name: BASE_URL
        value: "https://some-site.com"
      - name: AWS_ACCESS_KEY_ID
        valueFrom:
          secretKeyRef:
            name: ${secret_name}
            key: AWS_ACCESS_KEY_ID
      - name: AWS_SECRET_ACCESS_KEY
        valueFrom:
          secretKeyRef:
            name: ${secret_name}
            key: AWS_SECRET_ACCESS_KEY
      - name: K6_DISCARD_RESPONSE_BODIES
        value: "true"
```

result:

K6 console.log
```
time="2023-05-10T07:28:46Z" level=info msg="{\"key\":\"data.json\",\"lastModified\":null,\"size\":40017,\"storageClass\":\"STANDARD\",\"data\":null}" source=console

     █ setup

     data_received..................: 47 kB 546 kB/s
     data_sent......................: 971 B 11 kB/s
     http_req_blocked...............: avg=16.01ms min=16.01ms med=16.01ms max=16.01ms p(90)=16.01ms p(95)=16.01ms
     http_req_connecting............: avg=1.64ms min=1.64ms med=1.64ms max=1.64ms p(90)=1.64ms p(95)=1.64ms
     http_req_duration..............: avg=67.44ms min=67.44ms med=67.44ms max=67.44ms p(90)=67.44ms p(95)=67.44ms
       { expected_response:true }...: avg=67.44ms min=67.44ms med=67.44ms max=67.44ms p(90)=67.44ms p(95)=67.44ms
     http_req_failed................: 0.00% ✓ 0 ✗ 1
     http_req_receiving.............: avg=1.8ms min=1.8ms med=1.8ms max=1.8ms p(90)=1.8ms p(95)=1.8ms
     http_req_sending...............: avg=37.23µs min=37.23µs med=37.23µs max=37.23µs p(90)=37.23µs p(95)=37.23µs
     http_req_tls_handshaking.......: avg=7.49ms min=7.49ms med=7.49ms max=7.49ms p(90)=7.49ms p(95)=7.49ms
     http_req_waiting...............: avg=65.6ms min=65.6ms med=65.6ms max=65.6ms p(90)=65.6ms p(95)=65.6ms
     http_reqs......................: 1 11.689142/s
     iteration_duration.............: avg=84.37ms min=84.37ms med=84.37ms max=84.37ms p(90)=84.37ms p(95)=84.37ms
     vus............................: 0 min=0 max=0
     vus_max........................: 10 min=10 max=10

time="2023-05-10T07:28:47Z" level=error msg="TypeError: Cannot read property 'requests' of undefined\n\tat setup (file:///test/script.js:21:45(24))\n" hint="script exception"
```

If I remove K6_DISCARD_RESPONSE_BODIES=true and run it, the data will be correctly retrieved from S3.
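
Until the library handles this, a guard in setup() could make the failure easier to diagnose than the TypeError above. This is only a sketch: `parseRequests` is a hypothetical helper name (not part of k6 or k6-jslib-aws), and it assumes the object returned by `s3.getObject()` carries the body in a `data` field, as the log output above suggests.

```javascript
// Hypothetical helper for the repro script above; not part of k6 or k6-jslib-aws.
// `object` is assumed to be the value returned by s3.getObject().
function parseRequests(object) {
  // With K6_DISCARD_RESPONSE_BODIES=true the log above shows "data": null,
  // so fail early with a message that points at the likely cause.
  if (!object || object.data === null || object.data === undefined) {
    throw new Error(
      'S3 object has no data; check whether K6_DISCARD_RESPONSE_BODIES ' +
      '(discardResponseBodies) is enabled'
    );
  }
  // Same parsing as in the original setup() function.
  return JSON.parse(object.data).requests;
}
```

In the script above, setup() would then `return { requests: parseRequests(object) };`.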

If I am missing any information, please let me know.

Thanks.

oleiade commented 1 year ago

Thanks a lot for your input 🙇🏻

We have investigated the issue and found a workaround: to ignore the discardResponseBodies option in the context of our client classes, we need to make sure that whenever we use http.request, we pass the responseType: 'text' | 'binary' option as a parameter.
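
As a rough illustration of that workaround (not the library's actual code), the sketch below forces a `responseType` onto every request it sends. `fetchWithBody` is a hypothetical helper name; it takes the `http` module as an argument so that the same function works with `k6/http` inside a script.

```javascript
// Hypothetical helper: not part of k6 or k6-jslib-aws.
// `http` is expected to be the k6/http module, or any object exposing
// request(method, url, body, params) with the same shape.
function fetchWithBody(http, method, url, body = null, params = {}, responseType = 'text') {
  // A per-request responseType ('text' or 'binary') asks k6 to keep the body
  // for this call, even when discardResponseBodies is enabled globally.
  const merged = Object.assign({}, params, { responseType });
  return http.request(method, url, body, merged);
}
```

In a k6 script this would be called as `fetchWithBody(http, 'GET', url)` after `import http from 'k6/http'`. The caller's other params (headers, tags, and so on) are copied rather than mutated.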

We're looking into the best ways to integrate that change in the library, and a fix should land with version 0.8.0 👍🏻

oleiade commented 1 year ago

Took a stab at this, but couldn't reach a satisfying solution yet. Moving this to 0.9.0.

KOConchobhair commented 2 months ago

from https://github.com/grafana/k6-jslib-aws/issues/105#issuecomment-2173268525

For context, I believe this is connected to https://github.com/grafana/k6-jslib-aws/issues/45; for which I unfortunately didn't find a satisfying solution at the time.

This one seemed straightforward to me. If I am calling listBuckets() in a k6 script, there is never a case where I want to ignore the response body (since I'm calling it), and that API can only return a string, so we simply need to hardcode responseType: 'text' in that case. Each S3 API would be handled case by case, so I don't think you need a generic solution here.
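
A minimal sketch of that case-by-case idea might look like the following. The names `RESPONSE_TYPE_BY_OPERATION` and `paramsFor` are hypothetical, and only the `listBuckets` entry comes from the comment above; other operations would have to be decided and added individually.

```javascript
// Hypothetical table: only listBuckets -> 'text' is taken from the discussion.
// Other S3 operations would be added one at a time as their needs are decided.
const RESPONSE_TYPE_BY_OPERATION = new Map([
  ['listBuckets', 'text'],
]);

// Returns a copy of `params` with the response type hardcoded for `operation`.
function paramsFor(operation, params = {}) {
  const responseType = RESPONSE_TYPE_BY_OPERATION.get(operation);
  if (responseType === undefined) {
    // Refuse to guess: each operation needs an explicit decision.
    throw new Error(`No response type decided yet for operation "${operation}"`);
  }
  return Object.assign({}, params, { responseType });
}
```

Throwing for unlisted operations is a design choice of this sketch: it keeps an operation from silently inheriting the global discardResponseBodies setting.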

oleiade commented 4 weeks ago

We have opened PR #114 in an attempt to address this issue. If any of you folks have the capacity to take it for a spin and let us know if it works for you, we would very much appreciate it 🙇🏻

To do so, simply check out the PR, run npm install && npm run webpack, and go ahead with a script reproducing the issue. Here's mine for instance:

import http from 'k6/http'
import { AWSConfig, S3Client } from '../dist/s3.js'

const awsConfig = new AWSConfig({
    region: __ENV.AWS_REGION,
    accessKeyId: __ENV.AWS_ACCESS_KEY_ID,
    secretAccessKey: __ENV.AWS_SECRET_ACCESS_KEY,
    sessionToken: __ENV.AWS_SESSION_TOKEN,
})

const s3 = new S3Client(awsConfig)

export default async function () {
    // List the buckets the AWS authentication configuration
    // gives us access to.
    const buckets = await s3.listBuckets()

    console.log(JSON.stringify(buckets))

    const responseWithBody = http.get('https://httpbin.org/get')
    console.log(responseWithBody.body)
}