
Distributed Load Testing on AWS
https://aws.amazon.com/solutions/implementations/distributed-load-testing-on-aws/

ParseResults failing for larger reports #81

Closed kstoner-sbux closed 11 months ago

kstoner-sbux commented 2 years ago

Describe the bug For long-running tests with a lot of APIs (what Amazonians call a "gameday" test), test results are not being logged in DynamoDB, and the Test Status message on the dashboard reads:

Test Failed Failed to parse the results.

UPDATE: I am wondering if this is related to the 400 KB item size limit in DynamoDB. Since the DLT framework stores all of a test's history in the same item as the rest of the test's metadata, the item size grows over time. This means that even small tests will eventually hit this bug.

SUGGESTION: The history for a test should be stored in individual test-history items that relate to the parent test in a one-to-many relationship.
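A minimal sketch of that one-to-many layout, assuming the AWS SDK for JavaScript v3; the table name, key names, and helper function here are hypothetical, not DLT's actual schema:

```ts
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, PutCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// Hypothetical layout: one item per test run, keyed by the parent
// test's ID, so no single item grows toward the 400 KB limit.
export async function saveTestRunResults(
  testId: string,
  testRunId: string,
  results: Record<string, unknown>,
): Promise<void> {
  await ddb.send(
    new PutCommand({
      TableName: "TestHistoryTable", // hypothetical table name
      Item: {
        testId,    // partition key: the parent test
        testRunId, // sort key: one item per run
        results,   // parsed results for this run only
      },
    }),
  );
}
```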

To Reproduce In my test I am executing 40+ APIs for a combined load of 4k+ TPS for approximately 2 hours. I have attached the offending results.xml file with obfuscated Group names.

Run a test with multiple samplers in the same JMX file over and over again until the test's DynamoDB item size starts to approach 400 KB.

Expected behavior The method should not fail to parse a properly formatted results file.

Additional context

results.xml.txt

emcfins commented 2 years ago

Thank you for submitting this bug. We will look into it and add it to our backlog.

kstoner-sbux commented 2 years ago

UPDATE: This appears to me to be a serious bug that everyone will hit at some point. The more often an individual test is run, the sooner that test will start to fail with the described ParseResults failure.

emcfins commented 2 years ago

Thank you for this update. We are looking to address this issue in upcoming releases.

jmnxno commented 2 years ago

I've been hitting this big time. I've tried to work around it, but no luck yet... Any ideas or suggestions would be helpful. Thanks.

G-Lenz commented 2 years ago

We have addressed this in v3.0.0 by moving the test history into a separate DynamoDB table.
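With per-run items like those sketched above, a test's full history can be read back with a Query on the parent test's key. A sketch, again with hypothetical table and key names rather than what v3.0.0 actually uses:

```ts
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, QueryCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// Fetch every run of one test from the (hypothetical) history table.
const { Items } = await ddb.send(
  new QueryCommand({
    TableName: "TestHistoryTable",
    KeyConditionExpression: "testId = :t",
    ExpressionAttributeValues: { ":t": "my-test-id" },
  }),
);
console.log(`Found ${Items?.length ?? 0} runs`);
```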

shanewarnerthrive commented 2 years ago

Still seeing this issue on v3.0.0. I ran 10 tasks with a concurrency of 60, a 5-minute ramp-up, and a 60-minute hold time. The .log output files are ~450 KB in size.

zxkane commented 2 years ago

I hit the same problem when using v3.0.0. Lots of empty JMeter XML files are generated in the test results bucket.

I found the logs below in the output of the JMeter task runner; it looks like there is no disk space left on the Fargate runner:

Created the tree successfully using /tmp/artifacts/modified_requests.jmx
Starting standalone test @ Mon Oct 24 07:45:19 UTC 2022 (1666597519397)
Waiting for possible Shutdown/StopTestNow/HeapDump/ThreadDump message on port 4445
2022-10-24 08:06:20,268 simple countly event by Restful API-ThreadStarter 1-9 ERROR Unable to write to stream /tmp/artifacts/jmeter.log for appender jmeter-log org.apache.logging.log4j.core.appender.AppenderLoggingException: Error writing to stream /tmp/artifacts/jmeter.log
    at org.apache.logging.log4j.core.appender.OutputStreamManager.writeToDestination(OutputStreamManager.java:252)
    at org.apache.logging.log4j.core.appender.FileManager.writeToDestination(FileManager.java:277)
    at org.apache.logging.log4j.core.appender.OutputStreamManager.flushBuffer(OutputStreamManager.java:283)
    at org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:294)
    at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.directEncodeEvent(AbstractOutputStreamAppender.java:199)
    at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.tryAppend(AbstractOutputStreamAppender.java:190)
    at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputStreamAppender.java:181)
    at org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:161)
    at org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:134)
    at org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:125)
    at org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:89)
    at org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:542)
    at org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:500)
    at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:483)
    at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:417)
    at org.apache.logging.log4j.core.config.AwaitCompletionReliabilityStrategy.log(AwaitCompletionReliabilityStrategy.java:82)
    at org.apache.logging.log4j.core.Logger.log(Logger.java:161)
    at org.apache.logging.log4j.spi.AbstractLogger.tryLogMessage(AbstractLogger.java:2205)
    at org.apache.logging.log4j.spi.AbstractLogger.logMessageTrackRecursion(AbstractLogger.java:2159)
    at org.apache.logging.log4j.spi.AbstractLogger.logMessageSafely(AbstractLogger.java:2142)
    at org.apache.logging.log4j.spi.AbstractLogger.logMessage(AbstractLogger.java:2034)
    at org.apache.logging.log4j.spi.AbstractLogger.logIfEnabled(AbstractLogger.java:1899)
    at org.apache.logging.slf4j.Log4jLogger.info(Log4jLogger.java:184)
    at org.apache.jmeter.threads.JMeterThread.run(JMeterThread.java:298)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.io.IOException: No space left on device
    at java.base/java.io.FileOutputStream.writeBytes(Native Method)
    at java.base/java.io.FileOutputStream.write(FileOutputStream.java:354)
    at org.apache.logging.log4j.core.appender.OutputStreamManager.writeToDestination(OutputStreamManager.java:250)
    ... 24 more

2022-10-24 08:06:20,271 simple countly event by Restful API-ThreadStarter 1-8 ERROR Unable to write to stream /tmp/artifacts/jmeter.log for appender jmeter-log org.apache.logging.log4j.core.appender.AppenderLoggingException: Error writing to stream /tmp/artifacts/jmeter.log
    at org.apache.logging.log4j.core.appender.OutputStreamManager.writeToDestination(OutputStreamManager.java:252)
    at org.apache.logging.log4j.core.appender.FileManager.writeToDestination(FileManager.java:277)
    at org.apache.logging.log4j.core.appender.OutputStreamManager.flushBuffer(OutputStreamManager.java:283)
    at org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:294)
    at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.directEncodeEvent(AbstractOutputStreamAppender.java:199)
    at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.tryAppend(AbstractOutputStreamA
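If the root cause here is disk pressure, note that Fargate tasks get 20 GiB of ephemeral storage by default and the task definition can request up to 200 GiB. A sketch with the AWS SDK for JavaScript v3; the family name, sizes, and container definition are illustrative, and DLT actually provisions its task definition through CloudFormation:

```ts
import { ECSClient, RegisterTaskDefinitionCommand } from "@aws-sdk/client-ecs";

const ecs = new ECSClient({});

// Register a revision that asks Fargate for more than the default
// 20 GiB of ephemeral storage, so large jmeter.log and results files
// do not fill /tmp.
await ecs.send(
  new RegisterTaskDefinitionCommand({
    family: "dlt-taskrunner", // illustrative family name
    requiresCompatibilities: ["FARGATE"],
    networkMode: "awsvpc",
    cpu: "2048",
    memory: "4096",
    ephemeralStorage: { sizeInGiB: 100 }, // Fargate allows 21-200 GiB
    containerDefinitions: [
      {
        name: "taskrunner",                        // illustrative
        image: "<existing DLT load-tester image>", // placeholder
        essential: true,
      },
    ],
  }),
);
```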
kamyarz-aws commented 11 months ago

Closing due to inactivity. If the issue persists, please reopen it and we will investigate.

bogdankatishev commented 8 months ago

Hello @kamyarz-aws ,

We are hitting this issue on v3.2.5. All of our "bigger" tests are failing with the same error message, "Failed to parse the results.", because we are hitting the 400 KB maximum item size in DynamoDB.

kamyarz-aws commented 8 months ago

@bogdankatishev Hmm, that is a limitation we need to address, of course. I will evaluate this and get back to you. In the meantime, can you share the details of the test (how many tasks, what concurrency, which regions) so I can see what kind of solution I have to work on?

bogdankatishev commented 7 months ago

@kamyarz-aws We were running a k6 test inside Taurus; I will try to reproduce it with a JMX test. But we noticed that the Group label key in the results.xml is the culprit: since we call an endpoint with different, unique query parameters each time, each call results in a new Group label entry in the XML file. (In our example, we have 264 entries of the Group label key in the final results.xml file.) The results.xml exceeds 400 KB quickly this way.

And then DLT fails to write the results to the DynamoDB History Table.

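If the label explosion comes from unique URLs, one possible workaround on the k6 side is to pin the request name so distinct query strings collapse into a single label. A minimal k6 sketch (the URL and name are made up, and whether Taurus carries this tag through to the Group labels in results.xml would need verifying):

```ts
import http from "k6/http";

export default function () {
  const id = Math.floor(Math.random() * 1000); // stand-in for the unique parameter

  // Pin the request name so each unique query string does not produce
  // its own label in the results.
  http.get(`https://api.example.com/items?id=${id}`, {
    tags: { name: "GET /items" },
  });
}
```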

bogdankatishev commented 6 months ago

Hello @kamyarz-aws, could this issue be reopened? And are there any updates regarding it?

kamyarz-aws commented 6 months ago

@bogdankatishev I have created a ticket on our side and will address it in our next minor release. If you want visibility, how about creating a new issue for this? The issues above were addressed previously, but the 400 KB size limit still needs to be addressed.

bogdankatishev commented 4 months ago

Hello @kamyarz-aws,

Any ETA for the next minor release that addresses these issues?

kamyarz-aws commented 4 months ago

@bogdankatishev Can you please provide us with steps to reproduce? This is in our backlog to be fixed in a future release.