SeleniumHQ / docker-selenium

Provides a simple way to run Selenium Grid with Chrome, Firefox, and Edge using Docker, making it easier to perform browser automation
http://www.selenium.dev/docker-selenium/

[🐛 Bug]: Selenium node and hub fail to connect to arbitrary address #2447

Open edison12a opened 3 weeks ago

edison12a commented 3 weeks ago

What happened?

Help: I get a connection error that contains a very unusual address, and I can't tell where the container is getting it from. This is what I see in the logs:

java.net.UnknownHostException: 829c6db0f73a46918aa3a29ec52cd79d-0024402779: 829c6db0f73a46918aa3a29ec52cd79d-0024402779: Name or service not known

This is my local Docker Compose file.

I deployed these containers to AWS with exactly the same properties/config.

version: "3"
services:
  chrome-node:
    image: selenium/node-chrome:4.25.0-20240922
    container_name: chrome-node
    ports:
      - "5555:5555"
      - "7900:7900"
    environment:
      SE_EVENT_BUS_HOST: "selenium-hub"
      SE_EVENT_BUS_PUBLISH_PORT: "4442"
      SE_EVENT_BUS_SUBSCRIBE_PORT: "4443"
      SE_NODE_HOST: "chrome-node"
      SE_NODE_PORT: "5555"
      SE_NODE_MAX_SESSIONS: "5"
      SE_NODE_OVERRIDE_MAX_SESSIONS: "true"
      SE_GRID_URL: "http://selenium-hub:4444"
      SE_ENABLE_TRACING: "false"
    platform: linux/amd64  # Specify platform here

  selenium-hub:
    image: selenium/hub:4.25.0-20240922
    container_name: selenium-hub
    ports:
      - "4442:4442"
      - "4443:4443"
      - "4444:4444"
    environment:
      SE_GRID_MAX_SESSION: "15"
      SE_NODE_PORT: "5555"
      GRID_NEW_SESSION_WAIT_TIMEOUT: "30"
      SE_NODE_SESSION_TIMEOUT: "300"
      SE_GRID_SEASONING_REGISTERING_ATTEMPTS: "5"
      SE_GRID_SEASONING_REGISTERING_RETRY_INTERVAL: "5"
      SE_ENABLE_TRACING: "false"
    platform: linux/amd64  # Specify platform here
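
The stack trace further down shows the exception being raised from java.net.InetAddress.getLocalHost(), which resolves the machine's own hostname. The odd-looking name is therefore most likely the hostname assigned to the container, not an address Selenium is trying to reach. A minimal sketch of one thing to try, assuming the runtime honors the Compose `hostname` key (plain Docker does; whether the AWS deployment does is not established in this thread), is to pin each container's hostname:

```yaml
services:
  selenium-hub:
    image: selenium/hub:4.25.0-20240922
    # Assumption: Docker maps this name to the container's IP in /etc/hosts,
    # so a lookup of the local hostname can succeed.
    hostname: selenium-hub

  chrome-node:
    image: selenium/node-chrome:4.25.0-20240922
    # Same assumption as above, applied to the node.
    hostname: chrome-node
```

Only the `hostname` keys are new here; the remaining keys from the compose file above would stay as they are.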

Relevant log output

However, I get this error on the Hub, and the container shuts down later.

Shutdown complete
2024-10-29 23:28:17,826 WARN stopped: selenium-grid-hub (terminated by SIGTERM)
2024-10-29 23:28:16,825 WARN received SIGTERM indicating exit request
2024-10-29 23:28:16,825 INFO waiting for selenium-grid-hub to die
Trapped SIGTERM/SIGINT/x so shutting down supervisord...
23:21:58.408 INFO [Hub.execute] - Started Selenium Hub 4.25.0 (revision 030fcf7918): http://10.0.1.156:4444
23:21:57.649 INFO [UnboundZmqEventBus.<init>] - Event bus ready
23:21:56.648 INFO [UnboundZmqEventBus.<init>] - Sockets created
23:21:56.624 INFO [UnboundZmqEventBus.<init>] - Connecting to tcp://127.0.0.1:4442 and tcp://127.0.0.1:4443
23:21:56.559 INFO [BoundZmqEventBus.<init>] - XPUB binding to [binding to tcp://*:4442, advertising as tcp://127.0.0.1:4442], XSUB binding to [binding to tcp://*:4443, advertising as tcp://127.0.0.1:4443]
23:21:56.554 WARN [HostIdentifier.resolveHostAddress] - Failed to resolve host address
java.net.UnknownHostException: 829c6db0f73a46918aa3a29ec52cd79d-0024402779: 829c6db0f73a46918aa3a29ec52cd79d-0024402779: Name or service not known
at java.base/java.net.InetAddress.getLocalHost(InetAddress.java:1671)
at org.openqa.selenium.net.HostIdentifier.resolveHostAddress(HostIdentifier.java:104)
at org.openqa.selenium.net.HostIdentifier.getHostAddress(HostIdentifier.java:123)
at org.openqa.selenium.net.DefaultNetworkInterfaceProvider.<init>(DefaultNetworkInterfaceProvider.java:54)
at org.openqa.selenium.net.NetworkUtils.<init>(NetworkUtils.java:48)
at org.openqa.selenium.events.zeromq.BoundZmqEventBus.<init>(BoundZmqEventBus.java:47)
at org.openqa.selenium.events.zeromq.ZeroMqEventBus.create(ZeroMqEventBus.java:49)
at org.openqa.selenium.events.zeromq.ZeroMqEventBus.create(ZeroMqEventBus.java:91)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:569)
at org.openqa.selenium.grid.config.ClassCreation.callCreateMethod(ClassCreation.java:51)
at org.openqa.selenium.grid.config.MemoizedConfig.lambda$getClass$4(MemoizedConfig.java:104)
at java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1708)
at org.openqa.selenium.grid.config.MemoizedConfig.getClass(MemoizedConfig.java:99)
at org.openqa.selenium.grid.server.EventBusOptions.createBus(EventBusOptions.java:51)
at org.openqa.selenium.grid.server.EventBusOptions.getEventBus(EventBusOptions.java:41)
at org.openqa.selenium.grid.commands.Hub.createHandlers(Hub.java:119)
at org.openqa.selenium.grid.TemplateGridServerCommand.asServer(TemplateGridServerCommand.java:47)
at org.openqa.selenium.grid.commands.Hub.execute(Hub.java:233)
at org.openqa.selenium.grid.TemplateGridCommand.lambda$configure$4(TemplateGridCommand.java:122)
at org.openqa.selenium.grid.Main.launch(Main.java:83)
at org.openqa.selenium.grid.Main.go(Main.java:56)
at org.openqa.selenium.grid.Main.main(Main.java:41)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:569)
at org.openqa.selenium.grid.Bootstrap.runMain(Bootstrap.java:77)
at org.openqa.selenium.grid.Bootstrap.main(Bootstrap.java:70)
Caused by: java.net.UnknownHostException: 829c6db0f73a46918aa3a29ec52cd79d-0024402779: Name or service not known
at java.base/java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.base/java.net.InetAddress$PlatformNameService.lookupAllHostAddr(InetAddress.java:934)
at java.base/java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1543)
at java.base/java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:852)
at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1533)
at java.base/java.net.InetAddress.getLocalHost(InetAddress.java:1666)
... 30 more
23:21:56.506 INFO [LoggingOptions.getTracer] - Using null tracer
23:21:56.503 INFO [LoggingOptions.configureLogEncoding] - Using the system default encoding
Oct 29, 2024 11:21:56 PM org.openqa.selenium.grid.config.TomlConfig <init>
WARNING: Please use quotes to denote strings. Upcoming TOML parser will require this and unquoted strings will throw an error in the future
Appending Selenium option: --tracing false
Tracing is disabled
2024-10-29 23:21:56,019 INFO success: selenium-grid-hub entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
Appending Selenium option: --log-level INFO
Appending Selenium option: --http-logs false
Appending Selenium option: --structured-logs false
Appending Selenium option: --reject-unsupported-caps false
2024-10-29 23:21:56,013 INFO spawned: 'selenium-grid-hub' with pid 30
2024-10-29 23:21:55,011 INFO supervisord started with pid 7
2024-10-29 23:21:55,010 INFO RPC interface 'supervisor' initialized
2024-10-29 23:21:55,007 INFO Included extra file "/etc/supervisor/conf.d/selenium-grid-hub.conf" during parsing

On the Chrome node, I get a similar error:

Shutdown complete
2024-10-29 23:56:37,618 WARN stopped: xvfb (terminated by SIGTERM)
2024-10-29 23:56:36,617 WARN stopped: vnc (terminated by SIGTERM)
2024-10-29 23:56:35,615 WARN stopped: novnc (terminated by SIGTERM)
2024-10-29 23:56:35,615 INFO waiting for xvfb, vnc to die
2024-10-29 23:56:32,611 WARN received SIGTERM indicating exit request
2024-10-29 23:56:32,611 INFO waiting for xvfb, vnc, novnc to die
Trapped SIGTERM/SIGINT/x so shutting down supervisord...
23:23:29.672 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
23:23:19.667 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
23:23:09.663 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
23:22:59.657 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
23:22:49.638 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
23:22:39.627 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
23:22:29.620 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
23:22:29.557 INFO [NodeServer.execute] - Started Selenium node 4.25.0 (revision 030fcf7918): http://chrome-node:5555
23:22:29.551 INFO [NodeServer$1.start] - Starting registration process for Node http://chrome-node:5555
23:22:29.093 INFO [Node.<init>] - Binding additional locator mechanisms: relative
23:22:29.070 INFO [NodeOptions.report] - Adding chrome for {"browserName": "chrome","browserVersion": "129.0","goog:chromeOptions": {"binary": "\u002fusr\u002fbin\u002fgoogle-chrome"},"platformName": "linux","se:containerName": "","se:noVncPort": 7900,"se:vncEnabled": true} 5 times
23:22:28.973 WARN [NodeOptions.getSessionFactories] - Max sessions set to 5
23:22:28.972 WARN [NodeOptions.getSessionFactories] - One browser session is recommended per available processor. Safari is always limited to 1 session per host.
23:22:28.973 WARN [NodeOptions.getSessionFactories] - Overriding this value for Internet Explorer is not recommended. Issues related to parallel testing with Internet Explored won't be accepted.
23:22:28.973 WARN [NodeOptions.getSessionFactories] - Double check if enabling 'override-max-sessions' is really needed
23:22:28.971 INFO [NodeOptions.getSessionFactories] - Detected 2 available processors
23:22:28.972 WARN [NodeOptions.getSessionFactories] - Overriding max recommended number of 2 concurrent sessions. Session stability and reliability might suffer!

23:22:28.939 WARN [HostIdentifier.resolveHostAddress] - Failed to resolve host address
java.net.UnknownHostException: 2c623b64135f4e0a896481f49d115074-2545354646: 2c623b64135f4e0a896481f49d115074-2545354646: Name or service not known
at java.base/java.net.InetAddress.getLocalHost(InetAddress.java:1671)
at org.openqa.selenium.net.HostIdentifier.resolveHostAddress(HostIdentifier.java:104)
at org.openqa.selenium.net.HostIdentifier.getHostAddress(HostIdentifier.java:123)
at org.openqa.selenium.net.DefaultNetworkInterfaceProvider.<init>(DefaultNetworkInterfaceProvider.java:54)
at org.openqa.selenium.net.NetworkUtils.<init>(NetworkUtils.java:48)
at org.openqa.selenium.grid.node.config.NodeOptions.getPublicGridUri(NodeOptions.java:141)
at org.openqa.selenium.grid.node.local.LocalNodeFactory.create(LocalNodeFactory.java:65)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:569)
at org.openqa.selenium.grid.config.ClassCreation.callCreateMethod(ClassCreation.java:51)
at org.openqa.selenium.grid.config.MemoizedConfig.lambda$getClass$4(MemoizedConfig.java:104)
at java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1740)
at org.openqa.selenium.grid.config.MemoizedConfig.getClass(MemoizedConfig.java:99)
at org.openqa.selenium.grid.node.config.NodeOptions.getNode(NodeOptions.java:181)
at org.openqa.selenium.grid.node.httpd.NodeServer.createHandlers(NodeServer.java:126)
at org.openqa.selenium.grid.node.httpd.NodeServer.asServer(NodeServer.java:185)
at org.openqa.selenium.grid.node.httpd.NodeServer.execute(NodeServer.java:247)
at org.openqa.selenium.grid.TemplateGridCommand.lambda$configure$4(TemplateGridCommand.java:122)
at org.openqa.selenium.grid.Main.launch(Main.java:83)
at org.openqa.selenium.grid.Main.go(Main.java:56)
at org.openqa.selenium.grid.Main.main(Main.java:41)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:569)
at org.openqa.selenium.grid.Bootstrap.runMain(Bootstrap.java:77)
at org.openqa.selenium.grid.Bootstrap.main(Bootstrap.java:70)
Caused by: java.net.UnknownHostException: 2c623b64135f4e0a896481f49d115074-2545354646: Name or service not known
at java.base/java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.base/java.net.InetAddress$PlatformNameService.lookupAllHostAddr(InetAddress.java:934)
at java.base/java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1543)
at java.base/java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:852)
at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1533)
at java.base/java.net.InetAddress.getLocalHost(InetAddress.java:1666)
... 28 more
23:22:28.892 INFO [LoggingOptions.getTracer] - Using null tracer
23:22:28.886 INFO [NodeServer.createHandlers] - Reporting self as: http://chrome-node:5555
23:22:28.662 INFO [UnboundZmqEventBus.<init>] - Event bus ready
23:22:27.653 INFO [UnboundZmqEventBus.<init>] - Sockets created
23:22:27.245 INFO [LoggingOptions.getTracer] - Using null tracer
23:22:27.239 INFO [LoggingOptions.configureLogEncoding] - Using the system default encoding
Oct 29, 2024 11:22:27 PM org.openqa.selenium.grid.config.TomlConfig <init>
WARNING: Please use quotes to denote strings. Upcoming TOML parser will require this and unquoted strings will throw an error in the future
2024-10-29 23:22:26,778 INFO success: novnc entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2024-10-29 23:22:26,777 INFO success: xvfb entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2024-10-29 23:22:26,777 INFO success: vnc entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
Starting Selenium Grid Node...
[events]
[server]
host = "chrome-node"
port = "5555"
[node]
session-timeout = "300"
override-max-sessions = true
detect-drivers = false
drain-after-session-count = 0
max-sessions = 5
[[node.driver-configuration]]
display-name = "chrome"
stereotype = '{"browserName": "chrome", "browserVersion": "129.0", "platformName": "Linux", "goog:chromeOptions": {"binary": "/usr/bin/google-chrome"}, "se:containerName": ""}'
max-sessions = 5
Appending Selenium option: --tracing false
Tracing is disabled
Selenium Grid Node configuration:
Setting up SE_NODE_GRID_URL...
Appending Selenium option: --session-timeout 300
Appending Selenium option: --heartbeat-period 30
Appending Selenium option: --log-level INFO
Appending Selenium option: --http-logs false
Appending Selenium option: --structured-logs false
Generating Selenium Config
2024-10-29 23:22:25,675 INFO success: selenium-node entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2024-10-29 23:22:25,626 INFO spawned: 'selenium-node' with pid 34
2024-10-29 23:22:25,625 INFO spawned: 'novnc' with pid 33
2024-10-29 23:22:25,622 INFO spawned: 'vnc' with pid 32
2024-10-29 23:22:25,620 INFO spawned: 'xvfb' with pid 31
2024-10-29 23:22:24,618 INFO RPC interface 'supervisor' initialized
2024-10-29 23:22:24,618 INFO supervisord started with pid 8
2024-10-29 23:22:24,614 INFO Included extra file "/etc/supervisor/conf.d/chrome-cleanup.conf" during parsing
2024-10-29 23:22:24,615 INFO Included extra file "/etc/supervisor/conf.d/selenium.conf" during parsing

Command used to start Selenium Grid with Docker (or Kubernetes)

1. The containers run on AWS ECS. Each has its own Load Balancer and DNS.

2. I am able to ping or call each container from the other; for example, I can curl the hub from the node and vice versa.

Operating System

linux/ubuntu:20

Docker Selenium version (image tag)

4.25.0-20240922

Selenium Grid chart version (chart version)

No response

github-actions[bot] commented 3 weeks ago

@edison12a, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then the I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

VietND96 commented 3 weeks ago

Can you try adding SE_OPTS=--publish-events tcp://0.0.0.0:4442 --subscribe-events=tcp://0.0.0.0:4443 to the Hub config via the SE_OPTS env var?
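
As a rough sketch, this suggestion might look as follows in the hub service of the compose file from the original report. The service name, image tag, and ports are copied from that file. Writing both flags in the space-separated form is an assumption, made because the error reported in the next comment indicates the `--subscribe-events=...` form was not parsed as an option.

```yaml
services:
  selenium-hub:
    image: selenium/hub:4.25.0-20240922
    container_name: selenium-hub
    ports:
      - "4442:4442"
      - "4443:4443"
      - "4444:4444"
    environment:
      # Extra command-line options passed through to the Grid process.
      # Assumption: flag and value are separated by a space, not "=".
      SE_OPTS: "--publish-events tcp://0.0.0.0:4442 --subscribe-events tcp://0.0.0.0:4443"
```

The other hub environment variables from the original file are omitted for brevity and would be kept alongside SE_OPTS.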

edison12a commented 3 weeks ago

@VietND96

I got the error:

chrome-node   | Was passed main parameter '--subscribe-events=tcp://selenium-hub:4443' but no main parameter was defined in your arg class

But adding SE_OPTS: "--hub http://selenium-hub:4444" did the trick. The chrome node now throws no connection error, but it still shuts down after its connection attempts. The Selenium hub still has all the errors.
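
For reference, the setting described above could sit in the chrome-node service roughly as shown below. This is a sketch based on the compose file from the original report: only the SE_OPTS line is new, and whether the existing event-bus variables are still needed alongside it is not confirmed in this thread.

```yaml
services:
  chrome-node:
    image: selenium/node-chrome:4.25.0-20240922
    container_name: chrome-node
    environment:
      SE_EVENT_BUS_HOST: "selenium-hub"
      SE_EVENT_BUS_PUBLISH_PORT: "4442"
      SE_EVENT_BUS_SUBSCRIBE_PORT: "4443"
      SE_NODE_HOST: "chrome-node"
      SE_NODE_PORT: "5555"
      # Reported above as the change that removed the node's connection error.
      SE_OPTS: "--hub http://selenium-hub:4444"
```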

VietND96 commented 2 weeks ago

If you have time, can you share some details on the infrastructure where you deploy the containers? Which AWS service do you use, and how is the network configured such that an arbitrary address is assigned to the container?