dmitry-lomakin opened this issue 3 years ago
Hi @dmitry-lomakin, I am glad you are trying this out, here are my suggestions:
S1: Are you on a VPN? I have sometimes seen this happen when using the grid over a VPN; try disconnecting.
S2: Try increasing the value of the idle_timeout.timeout_seconds attribute on the AWS ALB. I modified this attribute on my load balancer, and you can do the same by following the steps below: in the load balancer settings, set Idle timeout to 4000 (that's the maximum value). After the update, I ran 40 concurrent sessions of the sample test case that's checked in as part of the GitHub repo, in a loop, and I could see at least one session running on each of the nodes.
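If you'd rather make the same change from code than the console, here is a minimal sketch using boto3 (the AWS SDK for Python). The load-balancer ARN is a placeholder you'd replace with your own; `idle_timeout.timeout_seconds` and its 4000-second maximum are the documented ALB attribute.

```python
def idle_timeout_attrs(seconds: int) -> list:
    """Build the attribute payload for modify_load_balancer_attributes."""
    return [{"Key": "idle_timeout.timeout_seconds", "Value": str(seconds)}]

def raise_idle_timeout(lb_arn: str, seconds: int = 4000) -> None:
    """Raise the ALB idle timeout; requires AWS credentials to actually run."""
    import boto3  # imported lazily so the helper above works without boto3 installed
    elbv2 = boto3.client("elbv2")
    elbv2.modify_load_balancer_attributes(
        LoadBalancerArn=lb_arn,  # placeholder: your ALB's ARN
        Attributes=idle_timeout_attrs(seconds),
    )
```

This is equivalent to editing Idle timeout in the console; the console steps above are all you need if you only have one load balancer to change.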
Test batch 1 (20 concurrent run - in view):
Test batch 2 (20 concurrent run - in view):
Grid console:
Hope this helps. Thanks
This same error occurs for me too.
Not using a VPN
ALB timeout is 60 seconds. Error occurs within 5 or 6 seconds of running the test.
Using this project gives a remoteHost of 'null'
Looks like the data nodes are not getting registered with the master node. Normally, when a data node starts up, it uses the environment properties and startup command below to register itself with the master node:
env: {
HUB_PORT_4444_TCP_ADDR: options.loadBalancer.loadBalancerDnsName,
HUB_PORT_4444_TCP_PORT: '4444',
NODE_MAX_INSTANCES: this.seleniumNodeMaxInstances.toString(),
NODE_MAX_SESSION: this.seleniumNodeMaxSessions.toString(),
SE_OPTS: '-debug',
shm_size: '512',
},
PRIVATE=$(curl -s http://169.254.170.2/v2/metadata | jq -r '.Containers[1].Networks[0].IPv4Addresses[0]') ; export REMOTE_HOST=\"http://$PRIVATE:5555\" ; /opt/bin/entry_point.sh
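For context, the startup command above extracts the node's private IP from the task metadata (v2) document with jq. A rough Python equivalent of that jq expression, run against a trimmed-down sample payload (the structure follows the v2 task metadata document; the names and IPs here are illustrative, not from a real task), looks like:

```python
import json

# Illustrative, trimmed-down sample of the ECS task metadata (v2) document.
# The real document comes from http://169.254.170.2/v2/metadata.
sample_v2 = json.loads("""
{
  "Containers": [
    {"Name": "selenium-hub",  "Networks": [{"IPv4Addresses": ["10.0.0.10"]}]},
    {"Name": "selenium-node", "Networks": [{"IPv4Addresses": ["10.0.165.168"]}]}
  ]
}
""")

# Python equivalent of jq '.Containers[1].Networks[0].IPv4Addresses[0]':
private_ip = sample_v2["Containers"][1]["Networks"][0]["IPv4Addresses"][0]
remote_host = f"http://{private_ip}:5555"
```

Note how fragile the hard-coded `Containers[1]` index is: the ordering of that array isn't something you can rely on, and if the task reports only one container, jq returns null for the whole path, so REMOTE_HOST ends up as "http://null:5555" -- exactly the broken host in the logs below.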
Can you check the logs of the data nodes for any errors?
I'm getting the same errors, but I'm seeing nothing in the logs other than the remoteHost URL being null:
12:45:25.230 DEBUG [RegistrationServlet.process] - getting the following registration request : {
"class": "org.openqa.grid.common.RegistrationRequest",
"configuration": {
"browserTimeout": 200000,
"capabilities": [
{
"applicationName": "",
"browserName": "chrome",
"maxInstances": 500,
"platform": "LINUX",
"platformName": "LINUX",
"seleniumProtocol": "WebDriver",
"server:CONFIG_UUID": "fee0fc50-c652-4615-be0e-9159f3d6199e",
"version": "94.0.4606.61"
}
],
"custom": {
},
"debug": true,
"downPollingLimit": 2,
"enablePlatformVerification": true,
"host": "10.0.165.168",
"hub": "http:\u002f\u002ftestin-Selen-qidwISh9glTz-1519287742.eu-west-2.elb.amazonaws.com:4444\u002fgrid\u002fregister",
"id": "http:\u002f\u002fnull:5555",
"maxSession": 500,
"nodePolling": 5000,
"nodeStatusCheckTimeout": 5000,
"port": 5555,
"proxy": "org.openqa.grid.selenium.proxy.DefaultRemoteProxy",
"register": true,
"registerCycle": 5000,
"remoteHost": "http:\u002f\u002fnull:5555",
"role": "node",
"servlets": [
],
"timeout": 180,
"unregisterIfStillDownAfter": 60000,
"withoutServlets": [
]
},
"description": null,
"name": null
}
12:45:25.231 DEBUG [BaseRemoteProxy.getNewInstance] - Using class org.openqa.grid.selenium.proxy.DefaultRemoteProxy
12:45:25.232 DEBUG [BaseRemoteProxy.setupTimeoutListener] - starting cleanup thread
12:45:25.233 DEBUG [BaseRemoteProxy$CleanUpThread.run] - cleanup thread starting...
12:45:25.233 INFO [DefaultGridRegistry.add] - Registered a node http://null:5555
12:45:25.233 DEBUG [RegistrationServlet.lambda$process$0] - proxy added http://null:5555
12:45:30.235 DEBUG [DefaultRemoteProxy.isAlive] - Failed to check status of node: null: Name or service not known
12:45:35.235 DEBUG [DefaultRemoteProxy.isAlive] - Failed to check status of node: null
12:45:35.235 INFO [DefaultRemoteProxy.onEvent] - Marking the node http://null:5555 as down: cannot reach the node for 2 tries
12:45:40.237 DEBUG [DefaultRemoteProxy.isAlive] - Failed to check status of node: null: Name or service not known
12:45:45.238 DEBUG [DefaultRemoteProxy.isAlive] - Failed to check status of node: null
12:45:50.240 DEBUG [DefaultRemoteProxy.isAlive] - Failed to check status of node: null: Name or service not known
12:45:55.241 DEBUG [DefaultRemoteProxy.isAlive] - Failed to check status of node: null
12:46:00.242 DEBUG [DefaultRemoteProxy.isAlive] - Failed to check status of node: null: Name or service not known
12:46:05.243 DEBUG [DefaultRemoteProxy.isAlive] - Failed to check status of node: null
12:46:10.244 DEBUG [DefaultRemoteProxy.isAlive] - Failed to check status of node: null: Name or service not known
12:46:15.245 DEBUG [DefaultRemoteProxy.isAlive] - Failed to check status of node: null
12:46:20.247 DEBUG [DefaultRemoteProxy.isAlive] - Failed to check status of node: null: Name or service not known
12:46:25.247 DEBUG [DefaultRemoteProxy.isAlive] - Failed to check status of node: null
12:46:30.249 DEBUG [DefaultRemoteProxy.isAlive] - Failed to check status of node: null: Name or service not known
12:46:35.249 DEBUG [DefaultRemoteProxy.isAlive] - Failed to check status of node: null
12:46:35.249 INFO [DefaultRemoteProxy.onEvent] - Unregistering the node http://null:5555 because it's been down for 60014 milliseconds
12:33:44.476 DEBUG [SelfRegisteringRemote.registerToHub] - Updated node configuration: {
"browserTimeout": 200000,
"capabilities": [
{
"applicationName": "",
"browserName": "chrome",
"maxInstances": 500,
"platform": "LINUX",
"platformName": "LINUX",
"seleniumProtocol": "WebDriver",
"server:CONFIG_UUID": "fee0fc50-c652-4615-be0e-9159f3d6199e",
"version": "94.0.4606.61"
}
],
"custom": {
},
"debug": true,
"downPollingLimit": 2,
"enablePlatformVerification": true,
"host": "10.0.165.168",
"hub": "http:\u002f\u002ftestin-Selen-qidwISh9glTz-1519287742.eu-west-2.elb.amazonaws.com:4444\u002fgrid\u002fregister",
"id": "http:\u002f\u002fnull:5555",
"maxSession": 500,
"nodePolling": 5000,
"nodeStatusCheckTimeout": 5000,
"port": 5555,
"proxy": "org.openqa.grid.selenium.proxy.DefaultRemoteProxy",
"register": true,
"registerCycle": 5000,
"remoteHost": "http:\u002f\u002fnull:5555",
"role": "node",
"servlets": [
],
"timeout": 180,
"unregisterIfStillDownAfter": 60000,
"withoutServlets": [
]
}
12:33:44.476 INFO [SelfRegisteringRemote.registerToHub] - Registering the node to the hub: http://testin-Selen-qidwISh9glTz-1519287742.eu-west-2.elb.amazonaws.com:4444/grid/register
12:33:44.486 INFO [SelfRegisteringRemote.registerToHub] - The node is registered to the hub and ready to use
12:34:54.550 DEBUG [SelfRegisteringRemote.registerToHub] - Fetching browserTimeout and timeout values from the hub before sending registration request
To me, this suggests that this line https://github.com/aws-samples/run-selenium-tests-at-scale-using-ecs-fargate/blob/main/lib/index.js#L135 is failing to read the IP address correctly and is therefore producing a broken host. I haven't been able to debug it yet; I'll try to figure it out tomorrow, but if someone already knows the answer (or I'm totally misled), I'd appreciate them sharing it.
Forgot to update this, but I managed to get this working by replacing the aforementioned line of code with:
command: ["PRIVATE=$(curl -s $ECS_CONTAINER_METADATA_URI_V4 | jq -r '.Networks[0].IPv4Addresses[0]') ; export REMOTE_HOST=\"http://$PRIVATE:5555\" ; /opt/bin/entry_point.sh"],
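My understanding of why this fix works: $ECS_CONTAINER_METADATA_URI_V4 points at the per-container metadata endpoint, whose response carries the querying container's own Networks array at the top level, so no fragile Containers[1] indexing is needed. A Python sketch of the new jq expression against an illustrative sample payload (structure per the V4 container metadata response; values are made up):

```python
import json

# Illustrative, trimmed-down sample of the per-container metadata (V4)
# response. The real document comes from $ECS_CONTAINER_METADATA_URI_V4.
sample_v4 = json.loads("""
{
  "Name": "selenium-node",
  "Networks": [{"NetworkMode": "awsvpc", "IPv4Addresses": ["10.0.165.168"]}]
}
""")

# Python equivalent of jq '.Networks[0].IPv4Addresses[0]':
private_ip = sample_v4["Networks"][0]["IPv4Addresses"][0]
remote_host = f"http://{private_ip}:5555"
```

Because the lookup no longer depends on the container's position in a task-level array, the node always exports its own IP as REMOTE_HOST instead of null.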
Running tests doesn't seem to work on my machine:
Any clue as to what could be breaking?