Closed dagrooms52 closed 2 years ago
I've seen this same error from the controller container multiple times during this failure. It is failing to connect to the worker[0] instance, which always has IP 10.0.0.4.
START Running Jmeter on Mon Sep 27 16:31:37 UTC 2021
JVM_ARGS=-Xmn2908m -Xms11632m -Xmx11632m
jmeter args=-n -J server.rmi.ssl.disable=true -t sample.jmx -l results.jtl -e -o dashboard -R 10.0.0.5,10.0.0.4
Sep 27, 2021 4:31:39 PM java.util.prefs.FileSystemPreferences$1 run
INFO: Created user preferences directory.
Creating summariser <summary>
Created the tree successfully using sample.jmx
Configuring remote engine: 10.0.0.5
Configuring remote engine: 10.0.0.4
Connection refused to host: 10.0.0.4; nested exception is:
java.net.ConnectException: Connection refused (Connection refused)
Failed to configure 10.0.0.4
Stopping remote engines
Remote engines have been stopped
Error in NonGUIDriver java.lang.RuntimeException: Following remote engines could not be configured:[10.0.0.4]
END Running Jmeter on Mon Sep 27 16:31:40 UTC 2021
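One way to narrow this down before the controller runs might be a quick TCP reachability check against each worker. The sketch below is only an illustration: the function name `unreachable_workers` is hypothetical, and port 1099 is assumed because it is JMeter's default `server_port`; if the workers override that setting, the port would need to change.

```python
import socket


def unreachable_workers(hosts, port=1099, timeout=3.0):
    """Return the hosts that refuse or time out on a plain TCP connect.

    Assumption: the JMeter server listens on `port` (1099 is the default
    server_port; adjust if the worker configuration overrides it).
    """
    failed = []
    for host in hosts:
        try:
            # Open and immediately close a connection; we only care
            # whether something is listening on the other side.
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:
            # Covers ConnectionRefusedError, timeouts, and unroutable hosts.
            failed.append(host)
    return failed
```

Calling `unreachable_workers(["10.0.0.5", "10.0.0.4"])` from the controller's network would list the workers that are not accepting connections, which would at least separate "worker never started" from "worker started but JMeter could not configure it".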
I'm going to bet it's the same problem as issue #78 I just reported.
The actual reason is that the worker container fails to start JMeter. Because the worker doesn't have restart_policy specified, ACI tries to restart it a couple of times before giving up, which wipes the console log. I was able to catch it by setting restart_policy to Never, like the controller, and watching the ACI console as it runs.
I have no idea how to resolve it yet though.
I'm not sure. I haven't seen these containers come up with a loopback address (127.0.0.1:37683 in your issue). They seem to start with the correct IP, yet the controller isn't able to contact them.
Thanks for the tip on restart_policy, though. I will set that to Never for workers so I can grab logs and compare to your info.
The worker container actually starts with the right IP, but for some reason the heuristics that JMeter employs aren't able to resolve it. You can also verify this by skipping the cleanup step on failure and manually trying to restart the worker container.
You're right, I was able to stop the containers and caught 3 of them reproducing this when deploying 20.
START Running Jmeter on Tue Sep 28 00:04:40 UTC 2021
JVM_ARGS=-Xmn1572m -Xms6288m -Xmx6288m
jmeter args=-s -J server.rmi.ssl.disable=true
Sep 28, 2021 12:04:42 AM java.util.prefs.FileSystemPreferences$1 run
INFO: Created user preferences directory.
Created remote object: UnicastServerRef2 [liveRef: [endpoint:[127.0.0.1:42101](local),objID:[10877288:17c29b7c631:-7fff, -8508851859447852716]]]
Server failed to start: java.rmi.RemoteException: Cannot start. SandboxHost-637683842593770648 is a loopback address.
An error occurred: Cannot start. SandboxHost-637683842593770648 is a loopback address.
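Since the worker log is the only place this failure shows up, it could be useful to scan captured logs for it automatically. The following is a minimal sketch, assuming the message format stays exactly as in the log above; the names `LOOPBACK_ERROR` and `loopback_hostname` are made up for illustration.

```python
import re

# Matches the failure line seen in the worker log above, e.g.
# "Cannot start. SandboxHost-637683842593770648 is a loopback address."
LOOPBACK_ERROR = re.compile(
    r"Cannot start\. (?P<host>\S+) is a loopback address\."
)


def loopback_hostname(log_text):
    """Return the hostname JMeter rejected as loopback, or None if absent."""
    match = LOOPBACK_ERROR.search(log_text)
    return match.group("host") if match else None
```

Running this over each worker's console output after a deployment would show which of the workers hit the loopback problem, without having to read every log by hand.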
See #78 for a possible workaround.
I've had issues with the jmeter-controller container failing to start from this pipeline intermittently.
There are no container logs to be pulled during 'TEST: Wait Test Execution', so this step of the pipeline exits immediately with a success code. It may be better to fail at this point in the pipeline by checking whether the container is running at the end of SETUP: Run Terraform Apply (target=all).
As far as I can tell, this is just an issue with JMeter's distributed worker reliability: there are no code changes between succeeding and failing runs except for a change in the WORKER_COUNT variable in the jmeter file.
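The "check that the container is running" idea could look roughly like the sketch below. It is not what the pipeline does today. The helper names `aci_state` and `wait_until_running` are hypothetical, and the Azure CLI query path `instanceView.state` and the "Running" state string are assumptions that should be checked against the actual `az container show` output for this deployment.

```python
import subprocess
import time


def aci_state(resource_group, name):
    """Ask the Azure CLI for a container group's state.

    Assumption: `az` is installed and logged in, and the group state is
    exposed at the `instanceView.state` query path.
    """
    result = subprocess.run(
        ["az", "container", "show",
         "--resource-group", resource_group,
         "--name", name,
         "--query", "instanceView.state",
         "--output", "tsv"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def wait_until_running(get_state, attempts=10, delay=5.0, sleep=time.sleep):
    """Poll get_state() until it returns "Running".

    Returns True as soon as the state is "Running", or False after
    `attempts` polls without seeing it. `sleep` is injectable for testing.
    """
    for attempt in range(attempts):
        if get_state() == "Running":
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A pipeline step could call `wait_until_running(lambda: aci_state(rg, controller_name))`, where `rg` and `controller_name` are placeholders for the real resource names, and exit non-zero when it returns False so the failure surfaces before 'TEST: Wait Test Execution'.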