artilleryio / artillery

The complete load testing platform. Everything you need for production-grade load tests. Serverless & distributed. Load test with Playwright. Load test HTTP APIs, GraphQL, WebSocket, and more. Use any Node.js module.
https://www.artillery.io
Mozilla Public License 2.0
8.04k stars 511 forks

A Lambda function has exited with an error. Reason: ArtilleryError #2178

Closed carrejoe3 closed 1 year ago

carrejoe3 commented 1 year ago

Version info:

2.0.0-37

Running this command:

npx artillery run --platform aws:lambda --platform-opt region=eu-west-2 --count 10 load-test.yml -e test -o reports/report.json

I expected to see this happen:

I am running the command above inside a CodeBuild project. I'm trying to generate 2200 RPS using the test file below.

Instead, this happened:

For each of the workers (10 in my case) I get the following output before the test begins:


[8d420a65-d1d0-4c38-adb1-cbb5c2387be8]: A Lambda function has exited with an error. Reason: ArtilleryError
{
  stdout: 'Test run id: t5tzx_m5ppfy4wnx9abaxfhwwdb7a5hrpf6_dqk5\n' +
    'Phase started: constant 2200 requests for 5 minutes (index: 0, duration: 300s) 14:33:16(+0000)\n' +
    '\n' +
    "worker error, id: 1 Error: EMFILE: too many open files, open '/var/task/node_modules/artillery/node_modules/@artilleryio/sketches-js/dist/ddsketch/proto/compiled.js'\n" +
    '    at Object.openSync (node:fs:590:3)\n' +
    '    at Object.readFileSync (node:fs:458:35)\n' +
    '    at Object.Module._extensions..js (node:internal/modules/cjs/loader:1215:18)\n' +
    '    at Module.load (node:internal/modules/cjs/loader:1076:32)\n' +
    '    at Function.Module._load (node:internal/modules/cjs/loader:911:12)\n' +
    '    at Module.require (node:internal/modules/cjs/loader:1100:19)\n' +
    '    at require (node:internal/modules/cjs/helpers:119:18)\n' +
    '    at DDSketch.BaseDDSketch.toProto (/var/task/node_modules/artillery/node_modules/@artilleryio/sketches-js/dist/ddsketch/DDSketch.js:156:29)\n' +
    '    at Function.serializeMetrics (/var/task/node_modules/artillery/node_modules/@artilleryio/int-core/lib/ssms.js:345:23)\n' +
    '    at SSMS.<anonymous> (/var/task/node_modules/artillery/node_modules/@artilleryio/int-core/lib/runner.js:237:49) {\n' +
    '  errno: -24,\n' +
    "  syscall: 'open',\n" +
    "  code: 'EMFILE',\n" +
    "  path: '/var/task/node_modules/artillery/node_modules/@artilleryio/sketches-js/dist/ddsketch/proto/compiled.js'\n" +
    '}\n',
  stderr: ''
}

The load test then runs, but the desired RPS isn't reached and there are a lot of EMFILE errors:

--------------------------------------
Metrics for period to: 14:33:20(+0000) (width: 3.853s)
--------------------------------------

errors.EMFILE: ................................................................. 1276
http.codes.200: ................................................................ 991
http.downloaded_bytes: ......................................................... 1982
http.request_rate: ............................................................. 596/sec
http.requests: ................................................................. 2268
http.response_time:
  min: ......................................................................... 2
  max: ......................................................................... 75
  median: ...................................................................... 24.8
  p95: ......................................................................... 44.3
  p99: ......................................................................... 54.1
http.responses: ................................................................ 991
plugins.metrics-by-endpoint.response_time.status page:
  min: ......................................................................... 2
  max: ......................................................................... 75
  median: ...................................................................... 24.8
  p95: ......................................................................... 44.3
  p99: ......................................................................... 54.1
plugins.metrics-by-endpoint.status page.codes.200: ............................. 991
vusers.completed: .............................................................. 991
vusers.created: ................................................................ 2269
vusers.created_by_name.status page: ............................................ 2269
vusers.failed: ................................................................. 1276
vusers.session_length:
  min: ......................................................................... 55
  max: ......................................................................... 620.3
  median: ...................................................................... 165.7
  p95: ......................................................................... 399.5
  p99: ......................................................................... 528.6

Files being used:

config:
  phases:
    - name: 'constant 2200 requests for 5 minutes'
      duration: 300
      arrivalRate: 2200

  environments:
    test:
      target: 'some website'

  plugins:
    metrics-by-endpoint:
      useOnlyRequestNames: true
      stripQueryString: true
      ignoreUnnamedRequests: false
      metricsPrefix: 'endpoint'

    publish-metrics:
      - type: cloudwatch
        region: eu-west-2
        namespace: load-testing
        name: codebuild
        dimensions:
          - name: configuration
            value: 'codebuild'

scenarios:
  - name: 'status page'
    flow:
      - get:
          name: 'status page'
          url: '/status'

bernardobridge commented 1 year ago

Hi @carrejoe3 👋,

Artillery needs open file descriptors at certain points of the test to work, and Lambda has a limit of 1024 file descriptors. Essentially, you're attempting to run too high an arrivalRate on each Lambda function, which causes the EMFILE errors.

Try to distribute your test over more workers (potentially substantially more) by increasing --count, and it should work.
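
To see what file descriptor limit a given environment imposes, one option is to query the process limit directly. The sketch below uses Python's standard `resource` module (Unix-only) purely as an illustration. It is not part of Artillery, and `fd_limits` is a hypothetical helper name. Inside a Lambda function the soft limit would be expected to match the 1024 mentioned above.

```python
import resource  # standard library, available on Unix-like systems only


def fd_limits() -> tuple[int, int]:
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = fd_limits()
# The soft limit is the one that triggers EMFILE when exceeded.
print(f"open file descriptors: soft limit {soft}, hard limit {hard}")
```

Every in-flight connection and every file opened by the test runner counts against the soft limit, which is why a very high per-worker arrival rate can exhaust it.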

carrejoe3 commented 1 year ago

Thanks @bernardobridge, is there any sort of rough mapping of how many workers you need per 100 requests? I know it would depend on the connection times but any info would be great, thanks.

bernardobridge commented 1 year ago

Hey @carrejoe3, it depends more on how heavy the requests and custom logic are. So the right answer is that you may need a little trial and error to find the sweet spot for your test.

In a previous company where I used Artillery, we used to limit things to no more than 200-250 requests per worker, but that was in Fargate which has different and more generous limits, and we were also being rather conservative.

There should be no problem with you just going quite high on the workers though, as Lambda is quite scalable and cheap, and you aren't running for a long time. So in your position I'd probably go 5x on the workers (and 5x less on arrivalRate per worker), and tune up or down from there.

carrejoe3 commented 1 year ago

Awesome, okay, thanks for the info. So requests aren't automatically split between workers, and you have to divide the requests by the number of workers you have to achieve the desired RPS?

bernardobridge commented 1 year ago

Yes @carrejoe3, that's correct. Each Lambda or Fargate worker will run one copy of the Artillery script you define.

So if your intention was to get 2200 arrivalRate in total, you need a much lower arrivalRate on each worker. You could try an arrivalRate of 220 with 10 workers to start with, for example.
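
The arithmetic in this reply can be sketched as follows. This is a minimal illustration in Python, assuming (as stated above) that every worker runs its own full copy of the script, so the effective total is the script's arrivalRate multiplied by `--count`. The function names are hypothetical and not part of Artillery.

```python
import math


def effective_total_rate(arrival_rate: int, workers: int) -> int:
    """Total arrivals per second when each worker runs the same script."""
    return arrival_rate * workers


def per_worker_arrival_rate(target_total: int, workers: int) -> int:
    """arrivalRate to put in the script so all workers together reach target_total."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # Round up so the combined rate never falls short of the target.
    return math.ceil(target_total / workers)


# The original run: arrivalRate 2200 in the script, --count 10.
print(effective_total_rate(2200, 10))     # 22000

# Reaching 2200 in total with 10 workers, then with 5x the workers.
print(per_worker_arrival_rate(2200, 10))  # 220
print(per_worker_arrival_rate(2200, 50))  # 44
```

With 10 workers the script would set `arrivalRate: 220`. Spreading the same target over 50 workers brings each worker down to 44, which leaves more headroom under the per-function file descriptor limit.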