SeleniumHQ / docker-selenium

Provides a simple way to run Selenium Grid with Chrome, Firefox, and Edge using Docker, making it easier to perform browser automation
http://www.selenium.dev/docker-selenium/
Other
7.88k stars 2.51k forks source link

[🐛 Bug]: High CPU usage 100% after some hours running #1762

Closed brunojs02 closed 1 year ago

brunojs02 commented 1 year ago

What happened?

I'm running a container with image selenium/standalone-chrome:latest, and this container is used by other python app container. After some hours runing, the cpu usage increase to 100% and keep it until i restart the container

Screenshot from 2023-01-02 20-51-46

Command used to start Selenium Grid with Docker

I'm using docker-compose to run chrome selenium driver and my python app with the follwing script:

chrome:
    container_name: chrome
    image: selenium/standalone-chrome:latest
    hostname: chrome
    privileged: true
    shm_size: 2g
    environment:
      - SE_NODE_SESSION_TIMEOUT=4000

PS: i need the long timeout because my python application sleep for 1 hour with some conditions

Relevant log output

2023-01-02 12:16:01,257 INFO Included extra file "/etc/supervisor/conf.d/selenium.conf" during parsing
2023-01-02 12:16:01,266 INFO RPC interface 'supervisor' initialized
2023-01-02 12:16:01,266 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2023-01-02 12:16:01,267 INFO supervisord started with pid 8
2023-01-02 12:16:02,270 INFO spawned: 'xvfb' with pid 10
2023-01-02 12:16:02,273 INFO spawned: 'vnc' with pid 11
2023-01-02 12:16:02,275 INFO spawned: 'novnc' with pid 12
2023-01-02 12:16:02,282 INFO spawned: 'selenium-standalone' with pid 13
Setting up SE_NODE_GRID_URL...
2023-01-02 12:16:02,319 INFO success: xvfb entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2023-01-02 12:16:02,319 INFO success: vnc entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2023-01-02 12:16:02,319 INFO success: novnc entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2023-01-02 12:16:02,319 INFO success: selenium-standalone entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
Selenium Grid Standalone configuration: 
[network]
relax-checks = true

[node]
session-timeout = "4000"
override-max-sessions = false
detect-drivers = false
drain-after-session-count = 0
max-sessions = 1

[[node.driver-configuration]]
display-name = "chrome"
stereotype = '{"browserName": "chrome", "browserVersion": "108.0", "platformName": "Linux"}'
max-sessions = 1

Starting Selenium Grid Standalone...
Tracing is disabled
12:16:04.246 INFO [LoggingOptions.configureLogEncoding] - Using the system default encoding
12:16:04.270 INFO [OpenTelemetryTracer.createTracer] - Using OpenTelemetry for tracing
12:16:05.368 WARN [HostIdentifier.resolveHostAddress] - Failed to resolve host address
java.net.UnknownHostException: chrome: chrome: No address associated with hostname
        at java.base/java.net.InetAddress.getLocalHost(InetAddress.java:1656)
        at org.openqa.selenium.net.HostIdentifier.resolveHostAddress(HostIdentifier.java:105)
        at org.openqa.selenium.net.HostIdentifier.getHostAddress(HostIdentifier.java:124)
        at org.openqa.selenium.net.DefaultNetworkInterfaceProvider.<init>(DefaultNetworkInterfaceProvider.java:56)
        at org.openqa.selenium.net.NetworkUtils.<init>(NetworkUtils.java:49)
        at org.openqa.selenium.grid.server.BaseServerOptions.lambda$getExternalUri$0(BaseServerOptions.java:90)
        at java.base/java.util.Optional.orElseGet(Optional.java:369)
        at org.openqa.selenium.grid.server.BaseServerOptions.getExternalUri(BaseServerOptions.java:88)
        at org.openqa.selenium.grid.commands.Standalone.createHandlers(Standalone.java:128)
        at org.openqa.selenium.grid.TemplateGridServerCommand.asServer(TemplateGridServerCommand.java:41)
        at org.openqa.selenium.grid.commands.Standalone.execute(Standalone.java:245)
        at org.openqa.selenium.grid.TemplateGridCommand.lambda$configure$4(TemplateGridCommand.java:129)
        at org.openqa.selenium.grid.Main.launch(Main.java:83)
        at org.openqa.selenium.grid.Main.go(Main.java:57)
        at org.openqa.selenium.grid.Main.main(Main.java:42)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.openqa.selenium.grid.Bootstrap.runMain(Bootstrap.java:77)
        at org.openqa.selenium.grid.Bootstrap.main(Bootstrap.java:70)
Caused by: java.net.UnknownHostException: chrome: No address associated with hostname
        at java.base/java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
        at java.base/java.net.InetAddress$PlatformNameService.lookupAllHostAddr(InetAddress.java:929)
        at java.base/java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1529)
        at java.base/java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:848)
        at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1519)
        at java.base/java.net.InetAddress.getLocalHost(InetAddress.java:1651)
        ... 20 more
12:16:06.741 INFO [NodeOptions.getSessionFactories] - Detected 1 available processors
12:16:06.850 INFO [NodeOptions.report] - Adding chrome for {"browserVersion": "108.0","se:noVncPort": 7900,"browserName": "chrome","platformName": "LINUX","se:vncEnabled": true} 1 times
12:16:06.894 INFO [Node.<init>] - Binding additional locator mechanisms: id, relative, name
12:16:06.947 INFO [GridModel.setAvailability] - Switching Node d8b7e64e-5463-4f4c-9165-4f71ecfdc273 (uri: http://172.17.0.1:4444) from DOWN to UP
12:16:06.947 INFO [LocalDistributor.add] - Added node d8b7e64e-5463-4f4c-9165-4f71ecfdc273 at http://172.17.0.1:4444. Health check every 120s
12:16:07.391 INFO [Standalone.execute] - Started Selenium Standalone 4.7.2 (revision 4d4020c3b7): http://172.17.0.1:4444
12:16:08.886 INFO [LocalDistributor.newSession] - Session request received by the Distributor: 
 [Capabilities {browserName: chrome, goog:chromeOptions: {args: [--headless, --disable-gpu], extensions: []}, pageLoadStrategy: normal}]
Starting ChromeDriver 108.0.5359.71 (1e0e3868ee06e91ad636a874420e3ca3ae3756ac-refs/branch-heads/5359@{#1016}) on port 4252
Only local connections are allowed.
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
ChromeDriver was started successfully.
12:16:10.511 INFO [LocalNode.newSession] - Session created by the Node. Id: 6a99c87cc4709560ee33297d9cdec6f5, Caps: Capabilities {acceptInsecureCerts: false, browserName: chrome, browserVersion: 108.0.5359.124, chrome: {chromedriverVersion: 108.0.5359.71 (1e0e3868ee06..., userDataDir: /tmp/.com.google.Chrome.srQoo1}, goog:chromeOptions: {debuggerAddress: localhost:42462}, networkConnectionEnabled: false, pageLoadStrategy: normal, platformName: LINUX, proxy: Proxy(), se:cdp: http://localhost:42462, se:cdpVersion: 108.0.5359.124, se:vncEnabled: true, se:vncLocalAddress: ws://172.17.0.1:7900, setWindowRect: true, strictFileInteractability: false, timeouts: {implicit: 0, pageLoad: 300000, script: 30000}, unhandledPromptBehavior: dismiss and notify, webauthn:extension:credBlob: true, webauthn:extension:largeBlob: true, webauthn:virtualAuthenticators: true}
12:16:10.527 INFO [LocalDistributor.newSession] - Session created by the Distributor. Id: 6a99c87cc4709560ee33297d9cdec6f5 
 Caps: Capabilities {acceptInsecureCerts: false, browserName: chrome, browserVersion: 108.0.5359.124, chrome: {chromedriverVersion: 108.0.5359.71 (1e0e3868ee06..., userDataDir: /tmp/.com.google.Chrome.srQoo1}, goog:chromeOptions: {debuggerAddress: localhost:42462}, networkConnectionEnabled: false, pageLoadStrategy: normal, platformName: LINUX, proxy: Proxy(), se:bidiEnabled: false, se:cdp: ws://172.17.0.1:4444/sessio..., se:cdpVersion: 108.0.5359.124, se:vnc: ws://172.17.0.1:4444/sessio..., se:vncEnabled: true, se:vncLocalAddress: ws://172.17.0.1:7900, setWindowRect: true, strictFileInteractability: false, timeouts: {implicit: 0, pageLoad: 300000, script: 30000}, unhandledPromptBehavior: dismiss and notify, webauthn:extension:credBlob: true, webauthn:extension:largeBlob: true, webauthn:virtualAuthenticators: true}
17:26:03.763 WARN [DefaultChannelPipeline.onUnhandledInboundException] - An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
java.lang.NullPointerException
        at org.openqa.selenium.netty.server.RequestConverter.channelRead0(RequestConverter.java:131)
        at org.openqa.selenium.netty.server.RequestConverter.channelRead0(RequestConverter.java:55)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:102)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:102)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:93)
        at org.openqa.selenium.netty.server.WebSocketUpgradeHandler.channelRead(WebSocketUpgradeHandler.java:100)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:93)
        at io.netty.handler.codec.http.websocketx.extensions.WebSocketServerExtensionHandler.channelRead(WebSocketServerExtensionHandler.java:99)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:93)
        at io.netty.handler.codec.http.HttpServerKeepAliveHandler.channelRead(HttpServerKeepAliveHandler.java:64)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436)
        at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:336)
        at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:323)
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:444)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:280)
        at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:829)
17:26:04.079 WARN [DefaultChannelPipeline.onUnhandledInboundException] - An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
java.lang.NullPointerException
        at org.openqa.selenium.netty.server.RequestConverter.channelRead0(RequestConverter.java:131)
        at org.openqa.selenium.netty.server.RequestConverter.channelRead0(RequestConverter.java:55)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:102)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:102)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:93)
        at org.openqa.selenium.netty.server.WebSocketUpgradeHandler.channelRead(WebSocketUpgradeHandler.java:100)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:93)
        at io.netty.handler.codec.http.websocketx.extensions.WebSocketServerExtensionHandler.channelRead(WebSocketServerExtensionHandler.java:99)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:93)
        at io.netty.handler.codec.http.HttpServerKeepAliveHandler.channelRead(HttpServerKeepAliveHandler.java:64)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436)
        at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:336)
        at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:323)
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:444)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:280)
        at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:829)

Operating System

CentOS 7 - Digital Ocean

Docker Selenium version (tag)

latest

diemol commented 1 year ago

How can we reproduce the issue?

brunojs02 commented 1 year ago

How can we reproduce the issue?

You can use this project https://github.com/brunojs02/visa_rescheduler/tree/issue-selenium at branch issue-selenium

diemol commented 1 year ago

I started the script and so far I do not see any high usage that remains for that long. Have you analyzed the running processes to determine which one is using all the CPU?

brunojs02 commented 1 year ago

for how many hours did you run the application?

diemol commented 1 year ago

About 1.5 hours. Did you analyze the processes and found the one who is consuming the most?

brunojs02 commented 1 year ago

I think 1.5 hours isn’t enough, some scenarios the time that cpu increased was higher than 5 hours. And yes, i had analyzed and the selenium container was responsible for the cpu usage, the python didn’t consume much cpu

diemol commented 1 year ago

Right, but the Selenium container has several processes running, and since I cannot reproduce the issue, hints from your side pointing the process inside the container that is using lots of CPU would be helpful to triage this issue. Thank you.

brunojs02 commented 1 year ago

Ok, I’ll run it again and after that, I return with more infos

brunojs02 commented 1 year ago

I found the process with highest cpu usage, its x11vnc with ≅ 90% Screenshot from 2023-01-19 13-02-06

diemol commented 1 year ago

I guess the alternative would be to create an environment variable to disable VNC in the containers. With that Xvfb would still run but VNC won't be started.

Is that something that would work for you? If so, do you think you can help us implement that?

brunojs02 commented 1 year ago

I would love to contribute to disable VNC startup, but i have no idea how to.. can you describe how these images works?

github-actions[bot] commented 9 months ago

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.