smithmicro / jmeter-ecs

JMeter Docker Image for Distributed Testing on EC2 Container Service (ECS)
Apache License 2.0

Slaves can't connect back to master #30

Closed innokentiyt closed 5 years ago

innokentiyt commented 5 years ago

First, I tried to incorporate version 5.0 of JMeter in a local fork. The containers seem to work fine, with no crashes, with the heap slightly reduced. The test starts on the master and the slaves seemingly receive commands from it, but they can't connect back to the master to send reports, and Lucy stays at:

[01:49:48] Creating summariser <summary>
[01:49:48] Created the tree successfully using /plans/demo.jmx
[01:49:48] Configuring remote engine: 172.30.2.60
[01:49:48] Configuring remote engine: 172.30.2.40
[01:49:48] Starting remote engines
[01:49:48] Starting the test @ Thu Jan 17 01:49:48 GMT 2019 (1547689788786)
[01:49:52] Remote engines have been started
[01:49:52] Waiting for possible Shutdown/StopTestNow/Heapdump message on port 4445

Looking at the ECS dashboard, I can only see the task definitions of the slaves. The master's container instance indicates that all memory resources are free ("Registered 985, Available 985"), ports 1099 and 50000 are not in use (or not bound), and there are 0 tasks running. But logging into Gru's EC2 instance over SSH shows that the container is running and the ports are bound:

0070afad1be6        innokentiyt/jmeter:5.0h          "/opt/jmeter/entrypo…"   15 minutes ago      Up 15 minutes       0.0.0.0:1099->1099/tcp, 0.0.0.0:51000->51000/tcp, 4445/udp, 50000/tcp   amazing_elion

The master can connect to the RMI ports of the slaves (I tested this using telnet and nc from the host and from inside the container). The slaves cannot connect back to the master. I tried loosening the VPC's Security Group rules, but it did not help.
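For anyone reproducing this, the reachability test described above can be sketched roughly as follows. This is a minimal sketch, not the exact commands used in the report: `check_port` and `GRU_IP` are hypothetical names, and it uses bash's `/dev/tcp` pseudo-device instead of telnet or nc, on the assumption that `bash` and `timeout` are available inside the container.

```shell
# Hypothetical helper (not part of jmeter-ecs): returns 0 if a TCP
# connection to host:port can be opened within 3 seconds, non-zero otherwise.
check_port() {
  host="$1"
  port="$2"
  # bash's /dev/tcp pseudo-device opens a TCP connection; timeout bounds the wait.
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Example, run from a slave toward the master. GRU_IP is a placeholder for
# the master's private IP; 51000 is the port published by the Gru container
# in the docker ps output above.
# check_port "$GRU_IP" 51000 && echo "reachable" || echo "blocked"
```

Running the same check in both directions (master to slave on 1099 and 50000, slave to master on 51000) helps show which direction is being blocked.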

innokentiyt commented 5 years ago

Using --network host when starting Gru helped. It seems something is different in my VPC configuration.

dsperling commented 5 years ago

I have created a new image for testing 5.0. If you want to give it a try:

docker pull smithmicro/jmeter:5.0

dsperling commented 5 years ago

I can now duplicate this problem with a new AWS VPC created by aws-setup.sh.

The identical setup works with the 4.0 image, but fails with 5.0 exactly as you describe above. I tried running Gru with --network host and it did not work for me.

How did you fix this?

In my research, I found this statement which is not accounted for in the current configuration: https://jmeter.apache.org/usermanual/remote-test.html

By default, RMI uses dynamic ports for the JMeter server engine. This can cause problems for firewalls, so you can define the JMeter property server.rmi.localport to control this port numbers. If this is non-zero, it will be used as the base for local port numbers for the server engine. At the moment JMeter will open up to three ports beginning with the port defined in server.rmi.localport.

dsperling commented 5 years ago

Eureka.

Turns out it was client.rmi.localport=51000 in Gru. Your proposal of --network host was part one of the fix. Part two is expanding the security group to not just include 51000, but a range like 51000-51999. Not sure how many ports the client (Gru) actually needs. A fix for server.rmi.localport=50000 might also be needed, but it seems to work great with the security group just allowing 50000.
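As a rough illustration of part two, the widened security group rule could be built along these lines. This is only a sketch under stated assumptions: `SG_ID` and `VPC_CIDR` are placeholders (the CIDR shown is a guess that merely covers the 172.30.2.x addresses in the log above), the count of 1000 ports simply mirrors the 51000-51999 range mentioned here, and the script only prints the AWS CLI command rather than running it.

```shell
# Placeholders: substitute your own security group ID and VPC CIDR.
SG_ID="${SG_ID:-sg-REPLACE_ME}"
VPC_CIDR="${VPC_CIDR:-172.30.0.0/16}"   # assumed value, not taken from the issue

# client.rmi.localport on Gru is the base port; allow a range above it.
BASE_PORT=51000
PORT_COUNT=1000   # assumption: the number of ports Gru really needs is unknown
LAST_PORT=$((BASE_PORT + PORT_COUNT - 1))
PORT_RANGE="${BASE_PORT}-${LAST_PORT}"

# Print the command for review instead of executing it.
echo "aws ec2 authorize-security-group-ingress" \
     "--group-id ${SG_ID} --protocol tcp" \
     "--port ${PORT_RANGE} --cidr ${VPC_CIDR}"
```

Printing the command first makes it easy to check the range before changing the security group. A narrower range may be enough once the number of ports Gru opens is known.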