microsoft / azure-load-testing

MIT License
22 stars 3 forks source link

[BUG] - Load test YAML gets corrupted on upload. #140

Open Shaun-Regan opened 1 year ago

Shaun-Regan commented 1 year ago

Describe the bug Load test YAML gets corrupted on upload. Failing/aborted Load Test is started using the Azure Load Testing v1 task in Azure Devops as part of a Release Pipeline (v1.2.20). Load test aborts after about 4 mins with this error is the test detail :

"Test run failed due to error in test run configuration. Please try again. If the issue persists, please raise a support ticket along with the test run id."

example runid testRunId 820afe4e-0960-4641-b791-0b7976763837

Test time summary on the step shows it appears to continue to run for over 3 hours. Screenshot 2023-01-31 133139

To Reproduce

  1. Start with a correctly configure YAML (Our YAML in our repo is correct)
  2. Start load test using the Azure Load Testing v1 task in Azure Devops as part of a Release Pipeline (v1.2.20)
  3. Test aborts
  4. Download the input files after test fails
  5. Review YAML in the input file zip
  6. Note that one of the failureCriteria has been corrupted during copy
  7. input file zip YAML is not a copy of the YAML in our repo.
  8. testRunId 820afe4e-0960-4641-b791-0b7976763837 is an example test with this issue

Expected behavior YAML not to be changed by the AZURE run test step in azure devops.

Screenshots REPO YAML Screenshot 2023-01-31 132103

YAML from the downloaded Input file zip from the failed/aborted test. Screenshot 2023-01-31 132230

Additional context This issue maybe related to issue https://github.com/microsoft/azure-load-testing/issues/136

AB#1734561

ninallam commented 1 year ago

@Shaun-Regan Thanks for reaching out. We will look into this? Meanwhile can you help us with details of what are you seeing in the Azure Portal for this test run? Did it fail? If yes, what is the error message?

Shaun-Regan commented 1 year ago

Load test aborts after about 4 mins with this error is the test detail :

"Test run failed due to error in test run configuration. Please try again. If the issue persists, please raise a support ticket along with the test run id."

Screenshot 2023-01-31 144918 Screenshot 2023-01-31 144631

ninallam commented 1 year ago

@Shaun-Regan Just confirming my understanding, the test run failed in 4 min on portal but the pipeline ran for 3+ hours. Additionally the YAML file is corrupted.

Can you please also share what was the duration for which you configured the test?

Shaun-Regan commented 1 year ago

@Shaun-Regan Just confirming my understanding, the test run failed in 4 min on portal but the pipeline ran for 3+ hours. Additionally the YAML file is corrupted.

Your understanding is correct.

Can you please also share what was the duration for which you configured the test?

The test is configured to run for 20 mins. Screenshot 2023-01-31 174028

ninallam commented 1 year ago

@Shaun-Regan We looked into this issue and found that the memory and the CPU for the test engine exceeded the capacity which made JMeter crash. Could you please reduce the number of threads per engine and configure more engines instead to achieve your target load?

Shaun-Regan commented 1 year ago

Hi, This seems like 100 users would not cause the issue. Here is my thinking why:

1.) The input files YAML is not in valid format (why would dropping the number of users resolve this) 2.) I have very similar tests executing successfully with 100 concurrent users. 3.) I can execute the exact same jmx in blazemeter using UK South - (London, Azure) as the location without issue
Screenshot 2023-02-01 143313 4.) Feels like 100 concurrent users is not very many for a load engine. 5.) I was under the impression that a load test engine should be capable of around 250 users each. 6.) If a load engine has an issue then the test step should abort, report the issue in the error detail and not continue to appear to execute for over 3 hours?

Even if dropping the users works, it feels like a workaround rather than an actual bugfix.

I will try dropping the users for this one test and see if that resolves the YAML issue and the test runs successfully. I'll let you know the results and the change.

Cheers

Shaun-Regan commented 1 year ago

hi, the test ran for 20 mins with only 50 users. I've looked at the YAML and it still looks incorrect. Please See screenshot below. Screenshot 2023-02-02 160440

Cheers

ninallam commented 1 year ago

@Shaun-Regan Thanks for confirming that the test ran. The YAML file that is exported is only an output file that is generated. It does not effect the test that you ran. We will fix the syntax of the YAML file. Thanks for pointing that out.

On the test run, you can monitor the engine health metrics to make sure the engines are withing acceptable memory and CPU utilization. We will make sure that in these cases a proper error message is provided.

Sakthi-github-07 commented 1 year ago

Hi @ninallam please help me with the below issue. I have followed your Azure Load testing service video, to run the JMeter script through portal. I have followed exact same steps. I have a JMeter script with csv data config and added necessary plugins , but still im getting the same Failure error. The test seems to be executing for long time and later it shows as failed. "Test run failed due to error in test run configuration. Please try again. If the issue persists, please raise a support ticket along with the test run id."

Whereas if i run a sample script without any CSV data files, it is running and the test is passed. Why these differences? do we have configure anything else while running with CSV data file? I'm trying to create a framework for my organization using Azure Load testing service, is it reliable to create one? Please help

image image