Azure-Samples / jmeter-aci-terraform

Scalable cloud load/stress testing pipeline solution with Apache JMeter and Terraform to dynamically provision and destroy the required infrastructure on Azure.
MIT License

Execution of tests is going into infinite loop #61

Open karthik-mut opened 3 years ago

karthik-mut commented 3 years ago

Hi, sometimes the test execution goes into an infinite loop. On the build agent, the test execution step keeps printing "Still Running... Still Running...". We have configured the test to run for a duration of 600 seconds. Can you please let us know how to debug this issue further?

hepsi204 commented 3 years ago

Hello,

This log line is printed while the pipeline waits for the state of the JMETER_CONTROLLER container instance to change from Running to a different state.

See line 84 of azure-pipelines.load-test.yml, reproduced below:

while [ $(az container show -g $RG -n $NAME --query "containers[0].instanceView.currentState.state" -o tsv) == "Running" ]; do
    echo "`date`: Still Running..."
    sleep 20
done
echo "`date`: Finished!"
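
As written, this loop has no upper bound of its own, so it keeps printing for as long as the controller reports Running. A minimal sketch of a bounded variant follows. The names get_state and wait_for_controller, and the default values of 900 and 20 seconds, are hypothetical and not part of the sample repository; only the az container show call is taken from the snippet above.

```shell
#!/usr/bin/env bash
# Sketch only: get_state and wait_for_controller are hypothetical helper names.

# Query the current state of the controller container instance.
# Assumes RG and NAME are set, as in the pipeline snippet.
get_state() {
    az container show -g "$RG" -n "$NAME" \
        --query "containers[0].instanceView.currentState.state" -o tsv
}

# Poll until the state is no longer "Running", or give up after max_wait seconds.
wait_for_controller() {
    local max_wait="${1:-900}"   # assumed default: total seconds to wait
    local interval="${2:-20}"    # seconds between polls
    local waited=0
    while [ "$(get_state)" = "Running" ]; do
        if [ "$waited" -ge "$max_wait" ]; then
            echo "$(date): Timed out after ${waited}s; controller is still Running" >&2
            return 1
        fi
        echo "$(date): Still Running..."
        sleep "$interval"
        waited=$((waited + interval))
    done
    echo "$(date): Finished!"
}
```

Calling wait_for_controller 900 20 would then fail the step with a non-zero exit code instead of waiting indefinitely, which makes a stuck run easier to spot.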

The state of JMETER_CONTROLLER container instance does not change until all the JMETER_WORKER container instances have finished the test runs or a timeout has been reached.

In the pipeline logs, you should see JMETER_WORKER instances logging lines similar to the following once they have finished a test run:

Starting the test on host xx.xx.xx.xx @ Sun Oct 25 17:50:21
Finished the test on host xx.xx.xx.xx @ Sun Oct 25 17:50:25

These lines are logged when the stage with the display name "RESULTS: Collect JMeter Controller and Worker Logs" finishes.

If a JMETER_WORKER container instance did not finish the test run, you won't find the corresponding 'Finished' line in the pipeline logs.
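
To spot such an instance quickly, the 'Starting' and 'Finished' lines can be paired up by host. The following is a rough sketch, assuming the log lines have the format shown above. unfinished_hosts is a hypothetical helper that reads log text on standard input; it is not part of the sample repository.

```shell
#!/usr/bin/env bash
# Sketch only: unfinished_hosts is a hypothetical helper name.
# Prints every host that logged "Starting the test" but never "Finished the test".
unfinished_hosts() {
    awk '
        /(Starting|Finished) the test on host/ {
            # The host address is the field right after the word "host".
            for (i = 1; i < NF; i++) if ($i == "host") host = $(i + 1)
            if ($0 ~ /Starting the test/) started[host] = 1
            else finished[host] = 1
        }
        END { for (h in started) if (!(h in finished)) print h }
    '
}
```

For example, the output of az container logs -g "$RG" -n "$NAME" could be piped into unfinished_hosts. Which container instance actually holds these lines in your setup is an assumption you would need to check.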

One possible cause is that a JMETER_WORKER container instance is running too many threads, or ramping up too many threads too quickly, which can exhaust the resources of that container instance. In that case, the test run might have finished within 600 seconds if the instance had enough resources or fewer threads to ramp up per unit of time.

Hope this helps.