[🐛 Bug]: Session not being created sometimes when running parallel tests through selenium grid.

kjjnygres commented 1 month ago

What happened?

Context:

I am trying to run parallel tests with the help of selenium grid. My grid and nodes all are using same machine.

Currently what is being done is when I run a test suite, a docker container gets spinned up for that suite, execute the test, and then destroys itself when the suite execution is complete. Every docker container here is a node, connected with the hub.

Which container/node should run which test suite is managed by desired capabilities:

SE_NODE_STEREOTYPE='{"browserName":"chrome","browserVersion":\"122.0\","goog:chromeOptions":{"binary":"/usr/bin/google-chrome"},"platformName":"linux","se:noVncPort":7900,"se:vncEnabled":true, "nodename:applicationName":"Container1"}

Problem:

Sometimes the suites execute just fine; sometimes one of the suites get failed; sometimes none of the test starts; giving below error:

"Driver info: driver.version: unknown

Stacktrace:

at org.openqa.selenium.grid.sessionqueue.local.LocalNewSessionQueue.addToQueue (LocalNewSessionQueue.java:221)

at org.openqa.selenium.grid.sessionqueue.NewSessionQueue.lambda$new$0 (NewSessionQueue.java:68)

at org.openqa.selenium.remote.http.Route$TemplatizedRoute.handle (Route.java:192)

at org.openqa.selenium.remote.http.Route.execute (Route.java:69)

at org.openqa.selenium.remote.http.Route$CombinedRoute.handle (Route.java:360)

at org.openqa.selenium.remote.http.Route.execute (Route.java:69)

at org.openqa.selenium.grid.sessionqueue.NewSessionQueue.execute (NewSessionQueue.java:128)

at org.openqa.selenium.remote.tracing.SpanWrappedHttpHandler.execute (SpanWrappedHttpHandler.java:87)

at org.openqa.selenium.remote.http.Filter$1.execute (Filter.java:63)

at org.openqa.selenium.remote.http.Route$CombinedRoute.handle (Route.java:360)

at org.openqa.selenium.remote.http.Route.execute (Route.java:69)

at org.openqa.selenium.grid.router.Router.execute (Router.java:87)

at org.openqa.selenium.grid.web.EnsureSpecCompliantResponseHeaders.lambda$apply$0 (EnsureSpecCompliantResponseHeaders.java:34)

at org.openqa.selenium.remote.http.Filter$1.execute (Filter.java:63)

at org.openqa.selenium.remote.http.Route$CombinedRoute.handle (Route.java:360)

at org.openqa.selenium.remote.http.Route.execute (Route.java:69)

at org.openqa.selenium.remote.AddWebDriverSpecHeaders.lambda$apply$0 (AddWebDriverSpecHeaders.java:35)

at org.openqa.selenium.remote.ErrorFilter.lambda$apply$0 (ErrorFilter.java:44)

at org.openqa.selenium.remote.http.Filter$1.execute (Filter.java:63)

at org.openqa.selenium.remote.ErrorFilter.lambda$apply$0 (ErrorFilter.java:44)

at org.openqa.selenium.remote.http.Filter$1.execute (Filter.java:63)

at org.openqa.selenium.netty.server.SeleniumHandler.lambda$channelRead0$0 (SeleniumHandler.java:44)

at java.util.concurrent.Executors$RunnableAdapter.call (None:-1)

at java.util.concurrent.FutureTask.run (None:-1)

at java.util.concurrent.ThreadPoolExecutor.runWorker (None:-1)

at java.util.concurrent.ThreadPoolExecutor$Worker.run (None:-1)

at java.lang.Thread.run (None:-1)"

System info:

os.name: 'Linux', os.arch: 'amd64', os.version: '5.14.0-427.22.1.el9_4.x86_64', java.version: '17.0.11'

Command used to start Selenium Grid with Docker (or Kubernetes)

docker run -d --net grid --name ${job} -e  SE_EVENT_BUS_HOST=selenium-hub -e SE_NODE_STEREOTYPE='{"browserName":"chrome","browserVersion":"122.0","goog:chromeOptions":{"binary":"/usr/bin/google-chrome"},"platformName":"linux","se:noVncPort":7900,"se:vncEnabled":true, "nodename:applicationName":\"${job}"}' --cap-add=CAP_AUDIT_WRITE   --shm-size="2g"     -e SE_EVENT_BUS_PUBLISH_PORT=4442     -e SE_EVENT_BUS_SUBSCRIBE_PORT=4443     localhost/node-image-xx

Relevant log output

HUB LOGS:

06:58:00.297 DEBUG [HttpClientImpl$SelectorManager.run] - [HttpClient-30-SelectorManager] [242859s 937ms] HttpClientImpl(30) next timeout: 0
06:58:00.297 DEBUG [HttpClientImpl$SelectorManager.run] - [HttpClient-30-SelectorManager] [242859s 937ms] HttpClientImpl(30) next expired: 0
06:58:00.297 DEBUG [HttpClientImpl$SelectorManager.run] - [HttpClient-30-SelectorManager] [242859s 937ms] HttpClientImpl(30) Next deadline is 3000
06:58:00.297 DEBUG [HttpClientImpl$SelectorManager.run] - [HttpClient-25-SelectorManager] [242859s 937ms] HttpClientImpl(25) next timeout: 0
06:58:00.297 DEBUG [HttpClientImpl$SelectorManager.run] - [HttpClient-25-SelectorManager] [242859s 937ms] HttpClientImpl(25) next expired: 0
06:58:00.297 DEBUG [HttpClientImpl$SelectorManager.run] - [HttpClient-25-SelectorManager] [242859s 937ms] HttpClientImpl(25) Next deadline is 3000
06:58:00.460 DEBUG [HttpClientImpl$SelectorAttachment.register] - [HttpClient-39-SelectorManager] [242860s 100ms] SelectorAttachment Registering jdk.internal.net.http.SocketTube$InternalReadPublisher$ReadEvent@4dc99526 for 0 (false)
06:58:00.460 DEBUG [SocketTube$InternalReadPublisher$InternalReadSubscription.read] - [HttpClient-39-SelectorManager] [242860s 100ms] SocketTube(223) read bytes: 125
06:58:00.460 DEBUG [Http1AsyncReceiver.asyncReceive] - [HttpClient-39-SelectorManager] [242860s 100ms] Http1AsyncReceiver(SocketTube(223)) Putting 125 bytes into the queue
06:58:00.460 DEBUG [SocketTube$InternalReadPublisher$InternalReadSubscription.resumeReadEvent] - [HttpClient-39-SelectorManager] [242860s 100ms] SocketTube(223) resuming read event
06:58:00.460 DEBUG [HttpClientImpl$SelectorAttachment.register] - [HttpClient-39-SelectorManager] [242860s 100ms] SelectorAttachment Registering jdk.internal.net.http.SocketTube$InternalReadPublisher$ReadEvent@4dc99526 for 1 (false)
06:58:00.460 DEBUG [SocketTube.debugState] - [HttpClient-39-SelectorManager] [242860s 100ms] SocketTube(223) leaving read() loop after onNext:  Reading: [ops=1, demand=0, stopped=false], Writing: [ops=0, demand=1]
06:58:00.460 DEBUG [HttpClientImpl$SelectorManager.run] - [HttpClient-39-SelectorManager] [242860s 100ms] HttpClientImpl(39) next timeout: 177753
06:58:00.460 DEBUG [HttpClientImpl$SelectorManager.run] - [HttpClient-39-SelectorManager] [242860s 100ms] HttpClientImpl(39) next expired: 0
06:58:00.460 DEBUG [HttpClientImpl$SelectorManager.run] - [HttpClient-39-SelectorManager] [242860s 100ms] HttpClientImpl(39) Next deadline is 3000
06:58:00.460 DEBUG [Http1AsyncReceiver.flush] - [JdkHttpClient-38-39] [242860s 100ms] Http1AsyncReceiver(SocketTube(223)) Got 125 bytes for delegate jdk.internal.net.http.Http1Response$HeadersReader@6bf3a392
06:58:00.460 DEBUG [Http1AsyncReceiver.hasDemand] - [JdkHttpClient-38-39] [242860s 100ms] Http1AsyncReceiver(SocketTube(223)) downstream subscription demand is 1
06:58:00.460 DEBUG [Http1AsyncReceiver.flush] - [JdkHttpClient-38-39] [242860s 100ms] Http1AsyncReceiver(SocketTube(223)) Forwarding 125 bytes to delegate jdk.internal.net.http.Http1Response$HeadersReader@6bf3a392
06:58:00.460 DEBUG [Http1Response$HeadersReader.handle] - [JdkHttpClient-38-39] [242860s 100ms] Http1Response(id=71309, PlainHttpConnection(SocketTube(223))) Sending 125/125 bytes to header parser
06:58:00.460 DEBUG [Http1Response$HeadersReader.handle] - [JdkHttpClient-38-39] [242860s 100ms] Http1Response(id=71309, PlainHttpConnection(SocketTube(223))) Parsing headers completed. bytes=111
06:58:00.461 DEBUG [Http1AsyncReceiver.unsubscribe] - [JdkHttpClient-38-39] [242860s 100ms] Http1AsyncReceiver(SocketTube(223)) Unsubscribed jdk.internal.net.http.Http1Response$HeadersReader@6bf3a392
06:58:00.461 DEBUG [Http1Response.lambda$readHeadersAsync$0] - [JdkHttpClient-38-39] [242860s 100ms] Http1Response(id=71309, PlainHttpConnection(SocketTube(223))) Reading Headers: creating Response object; state is now READING_BODY
06:58:00.461 DEBUG [Utils.lambda$wrapForDebug$9] - [JdkHttpClient-38-39] [242860s 100ms] Http1Exchange getResponseAsync completed successfully
06:58:00.461 DEBUG [Http1Response.readBody] - [JdkHttpClient-38-39] [242860s 100ms] Http1Response(id=71309, PlainHttpConnection(SocketTube(223))) readBody: return2Cache: true
06:58:00.461 DEBUG [Http1Response$ClientRefCountTracker.acquire] - [JdkHttpClient-38-39] [242860s 100ms] Http1Response(id=71309, PlainHttpConnection(SocketTube(223))) Operation started: incrementing ref count for jdk.internal.net.http.HttpClientImpl@677ef72b(39)
06:58:00.461 DEBUG [Http1AsyncReceiver.subscribe] - [JdkHttpClient-38-39] [242860s 100ms] Http1AsyncReceiver(SocketTube(223)) Subscribed pending jdk.internal.net.http.Http1Response$BodyReader@df9a2e7/parser=jdk.internal.net.http.ResponseContent$FixedLengthBodyParser@316ea0dc queue.isEmpty: false
06:58:00.461 DEBUG [Http1AsyncReceiver.flush] - [JdkHttpClient-38-39] [242860s 100ms] Http1AsyncReceiver(SocketTube(223)) Delegate done: 14
06:58:00.461 DEBUG [ResponseContent$FixedLengthBodyParser.onSubscribe] - [JdkHttpClient-38-39] [242860s 101ms] PlainHttpConnection(SocketTube(223))/ResponseContent/FixedLengthBodyParser length=14, onSubscribe: jdk.internal.net.http.Http1Response$Http1BodySubscriber
06:58:00.461 DEBUG [Http1AsyncReceiver.handlePendingDelegate] - [JdkHttpClient-38-39] [242860s 101ms] Http1AsyncReceiver(SocketTube(223)) delegate is now jdk.internal.net.http.Http1Response$BodyReader@df9a2e7/parser=jdk.internal.net.http.ResponseContent$FixedLengthBodyParser@316ea0dc, demand=9223372036854775807, canRequestMore=false, queue.isEmpty=false
06:58:00.461 DEBUG [Http1AsyncReceiver.flush] - [JdkHttpClient-38-39] [242860s 101ms] Http1AsyncReceiver(SocketTube(223)) Got 14 bytes for delegate jdk.internal.net.http.Http1Response$BodyReader@df9a2e7/parser=jdk.internal.net.http.ResponseContent$FixedLengthBodyParser@316ea0dc
06:58:00.461 DEBUG [Http1AsyncReceiver.hasDemand] - [JdkHttpClient-38-39] [242860s 101ms] Http1AsyncReceiver(SocketTube(223)) downstream subscription demand is 9223372036854775807
06:58:00.461 DEBUG [Http1AsyncReceiver.flush] - [JdkHttpClient-38-39] [242860s 101ms] Http1AsyncReceiver(SocketTube(223)) Forwarding 14 bytes to delegate jdk.internal.net.http.Http1Response$BodyReader@df9a2e7/parser=jdk.internal.net.http.ResponseContent$FixedLengthBodyParser@316ea0dc
06:58:00.461 DEBUG [Http1Response$BodyReader.handle] - [JdkHttpClient-38-39] [242860s 101ms] Http1Response(id=71309, PlainHttpConnection(SocketTube(223))) Sending 14/125 bytes to body parser
06:58:00.461 DEBUG [ResponseContent$FixedLengthBodyParser.accept] - [JdkHttpClient-38-39] [242860s 101ms] PlainHttpConnection(SocketTube(223))/ResponseContent/FixedLengthBodyParser Parser got 14 bytes (14 remaining / 14)
06:58:00.461 DEBUG [ResponseContent$FixedLengthBodyParser.accept] - [JdkHttpClient-38-39] [242860s 101ms] PlainHttpConnection(SocketTube(223))/ResponseContent/FixedLengthBodyParser Parser got all expected bytes: completing
06:58:00.461 DEBUG [Http1AsyncReceiver.clear] - [JdkHttpClient-38-39] [242860s 101ms] Http1AsyncReceiver(SocketTube(223)) cleared
06:58:00.461 DEBUG [Http1Response.onFinished] - [JdkHttpClient-38-39] [242860s 101ms] Http1Response(id=71309, PlainHttpConnection(SocketTube(223))) SocketTube(223): return to HTTP/1.1 pool
06:58:00.461 DEBUG [ConnectionPool.registerCleanupTrigger] - [JdkHttpClient-38-39] [242860s 101ms] ConnectionPool(39) registering CleanupTrigger(SocketTube(223))
06:58:00.461 DEBUG [SocketTube.connectFlows] - [JdkHttpClient-38-39] [242860s 101ms] SocketTube(223) connecting flows
06:58:00.461 DEBUG [SocketTube$InternalReadPublisher.subscribe] - [JdkHttpClient-38-39] [242860s 101ms] SocketTube(223) read publisher got subscriber
06:58:00.461 DEBUG [SocketTube$InternalReadPublisher$InternalReadSubscription.signalSubscribe] - [JdkHttpClient-38-39] [242860s 101ms] SocketTube(223) registering subscribe event
06:58:00.461 DEBUG [SocketTube.debugState] - [JdkHttpClient-38-39] [242860s 101ms] SocketTube(223) leaving read.subscribe:  Reading: [ops=1, demand=0, stopped=false], Writing: [ops=0, demand=1]
06:58:00.461 DEBUG [SocketTube$InternalWriteSubscriber.onSubscribe] - [JdkHttpClient-38-39] [242860s 101ms] SocketTube(223) subscribed for writing
06:58:00.461 DEBUG [SocketTube$InternalWriteSubscriber$WriteSubscription.dropSubscription] - [JdkHttpClient-38-39] [242860s 101ms] SocketTube(223) write: resetting demand to 0
06:58:00.461 DEBUG [SocketTube$InternalWriteSubscriber.onSubscribe] - [JdkHttpClient-38-39] [242860s 101ms] SocketTube(223) write: registering startSubscription event
06:58:00.461 DEBUG [JdkHttpClient.execute0] - Ending request (POST) /session/12ec27cf49981e66be488ce1e9608fce/element/f.BE7172DF69F9BC3578AD01D462D126DD.d.356E9239A2C37D687BE9F284C5BAB6FC.e.123/click in 2,248ms
06:58:00.461 DEBUG [Http1Response.lambda$readBody$2] - [JdkHttpClient-38-39] [242860s 101ms] Http1Response(id=71309, PlainHttpConnection(SocketTube(223))) Finished reading body: READING_BODY
06:58:00.461 DEBUG [Http1Response$ClientRefCountTracker.tryRelease] - [JdkHttpClient-38-39] [242860s 101ms] Http1Response(id=71309, PlainHttpConnection(SocketTube(223))) Operation finished: decrementing ref count for jdk.internal.net.http.HttpClientImpl@677ef72b(39)
06:58:00.461 DEBUG [Http1AsyncReceiver.flush] - [JdkHttpClient-38-39] [242860s 101ms] Http1AsyncReceiver(SocketTube(223)) Delegate done: 0
06:58:00.461 DEBUG [Http1AsyncReceiver.flush] - [JdkHttpClient-38-39] [242860s 101ms] Http1AsyncReceiver(SocketTube(223)) Got 0 bytes for delegate null
06:58:00.461 DEBUG [SocketTube$InternalReadPublisher$InternalReadSubscription.handleSubscribeEvent] - [HttpClient-39-SelectorManager] [242860s 101ms] SocketTube(223) subscribe event raised
06:58:00.461 DEBUG [SocketTube$InternalReadPublisher$InternalReadSubscription.handlePending] - [HttpClient-39-SelectorManager] [242860s 101ms] SocketTube(223) handling pending subscription for CleanupTrigger(SocketTube(223))
06:58:00.461 DEBUG [Http1AsyncReceiver$Http1TubeSubscriber.dropSubscription] - [HttpClient-39-SelectorManager] [242860s 101ms] Http1AsyncReceiver(SocketTube(223)) Http1TubeSubscriber: dropSubscription
06:58:00.462 DEBUG [SocketTube$InternalReadPublisher$InternalReadSubscription.handlePending] - [HttpClient-39-SelectorManager] [242860s 101ms] SocketTube(223) read demand reset to 0
06:58:00.462 DEBUG [SocketTube$InternalReadPublisher$InternalReadSubscription.handlePending] - [HttpClient-39-SelectorManager] [242860s 101ms] SocketTube(223) calling onSubscribe
06:58:00.462 DEBUG [SocketTube$InternalReadPublisher$InternalReadSubscription.request] - [HttpClient-39-SelectorManager] [242860s 101ms] SocketTube(223) got some demand for reading
06:58:00.462 DEBUG [SocketTube$InternalReadPublisher$InternalReadSubscription.resumeReadEvent] - [HttpClient-39-SelectorManager] [242860s 101ms] SocketTube(223) resuming read event
06:58:00.462 DEBUG [HttpClientImpl$SelectorAttachment.register] - [HttpClient-39-SelectorManager] [242860s 101ms] SelectorAttachment Registering jdk.internal.net.http.SocketTube$InternalReadPublisher$ReadEvent@4dc99526 for 1 (false)
06:58:00.462 DEBUG [SocketTube.debugState] - [HttpClient-39-SelectorManager] [242860s 101ms] SocketTube(223) leaving request(1):  Reading: [ops=1, demand=1, stopped=false], Writing: [ops=0, demand=0]
06:58:00.462 DEBUG [SocketTube$InternalReadPublisher$ReadSubscription.signalOnSubscribe] - [HttpClient-39-SelectorManager] [242860s 101ms] SocketTube(223) onSubscribe called
06:58:00.462 DEBUG [SocketTube$InternalReadPublisher$InternalReadSubscription.read] - [HttpClient-39-SelectorManager] [242860s 101ms] SocketTube(223) pending subscriber subscribed
06:58:00.462 DEBUG [SocketTube$InternalWriteSubscriber.startSubscription] - [HttpClient-39-SelectorManager] [242860s 101ms] SocketTube(223) write: starting subscription
06:58:00.462 DEBUG [SocketTube$InternalWriteSubscriber.startSubscription] - [HttpClient-39-SelectorManager] [242860s 101ms] SocketTube(223) write: offloading requestMore
06:58:00.462 DEBUG [HttpClientImpl$SelectorManager.run] - [HttpClient-39-SelectorManager] [242860s 102ms] HttpClientImpl(39) next timeout: 0
06:58:00.462 DEBUG [HttpClientImpl$SelectorManager.run] - [HttpClient-39-SelectorManager] [242860s 102ms] HttpClientImpl(39) next expired: 1199538
06:58:00.462 DEBUG [HttpClientImpl$SelectorManager.run] - [HttpClient-39-SelectorManager] [242860s 102ms] HttpClientImpl(39) Next deadline is 3000
06:58:00.461 DEBUG [SeleniumSpanExporter$1.lambda$export$4] - SpanData{spanContext=ImmutableSpanContext{traceId=6850bf803a94dc9ad91137cd4ff3d578, spanId=7981274a2ee559e2, traceFlags=01, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=true}, parentSpanContext=ImmutableSpanContext{traceId=6850bf803a94dc9ad91137cd4ff3d578, spanId=d8151636ec50488d, traceFlags=01, traceState=ArrayBasedTraceState{entries=[]}, remote=true, valid=true}, resource=Resource{schemaUrl=null, attributes={service.name="unknown_service:java", telemetry.sdk.language="java", telemetry.sdk.name="opentelemetry", telemetry.sdk.version="1.39.0"}}, instrumentationScopeInfo=InstrumentationScopeInfo{name=default, version=null, schemaUrl=null, attributes={}}, name=httpclient.execute, kind=INTERNAL, startEpochNanos=1721285878213717514, endEpochNanos=1721285880461765059, attributes=AttributesMap{data={span.kind=client, http.target=/session/12ec27cf49981e66be488ce1e9608fce/element/f.BE7172DF69F9BC3578AD01D462D126DD.d.356E9239A2C37D687BE9F284C5BAB6FC.e.123/click, http.status_code=200, http.method=POST}, capacity=128, totalAddedValues=4}, totalAttributeCount=4, events=[], totalRecordedEvents=0, links=[], totalRecordedLinks=0, status=ImmutableStatusData{statusCode=OK, description=Kind: OK Description:}, hasEnded=true}
06:58:00.462 DEBUG [SocketTube$InternalWriteSubscriber$WriteSubscription.requestMore] - [JdkHttpClient-38-39] [242860s 102ms] SocketTube(223) write: requesting more...
06:58:00.462 DEBUG [SocketTube.debugState] - [JdkHttpClient-38-39] [242860s 102ms] SocketTube(223) leaving requestMore:  Reading: [ops=1, demand=1, stopped=false], Writing: [ops=0, demand=1]
06:58:00.462 DEBUG [SeleniumSpanExporter$1.lambda$export$4] - SpanData{spanContext=ImmutableSpanContext{traceId=6850bf803a94dc9ad91137cd4ff3d578, spanId=b4392792f3ee386a, traceFlags=01, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=true}, parentSpanContext=ImmutableSpanContext{traceId=6850bf803a94dc9ad91137cd4ff3d578, spanId=d8151636ec50488d, traceFlags=01, traceState=ArrayBasedTraceState{entries=[]}, remote=true, valid=true}, resource=Resource{schemaUrl=null, attributes={service.name="unknown_service:java", telemetry.sdk.language="java", telemetry.sdk.name="opentelemetry", telemetry.sdk.version="1.39.0"}}, instrumentationScopeInfo=InstrumentationScopeInfo{name=default, version=null, schemaUrl=null, attributes={}}, name=reverse_proxy, kind=INTERNAL, startEpochNanos=1721285878213707345, endEpochNanos=1721285880462442638, attributes=AttributesMap{data={http.target=/session/12ec27cf49981e66be488ce1e9608fce/element/f.BE7172DF69F9BC3578AD01D462D126DD.d.356E9239A2C37D687BE9F284C5BAB6FC.e.123/click, http.status_code=200, http.method=POST}, capacity=128, totalAddedValues=3}, totalAttributeCount=3, events=[], totalRecordedEvents=0, links=[], totalRecordedLinks=0, status=ImmutableStatusData{statusCode=OK, description=Kind: OK Description:}, hasEnded=true}
06:58:00.462 DEBUG [SeleniumSpanExporter$1.lambda$export$4] - SpanData{spanContext=ImmutableSpanContext{traceId=6850bf803a94dc9ad91137cd4ff3d578, spanId=d8151636ec50488d, traceFlags=01, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=true}, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, resource=Resource{schemaUrl=null, attributes={service.name="unknown_service:java", telemetry.sdk.language="java", telemetry.sdk.name="opentelemetry", telemetry.sdk.version="1.39.0"}}, instrumentationScopeInfo=InstrumentationScopeInfo{name=default, version=null, schemaUrl=null, attributes={}}, name=router.handle_session, kind=INTERNAL, startEpochNanos=1721285878213562283, endEpochNanos=1721285880462577352, attributes=AttributesMap{data={http.target=/session/12ec27cf49981e66be488ce1e9608fce/element/f.BE7172DF69F9BC3578AD01D462D126DD.d.356E9239A2C37D687BE9F284C5BAB6FC.e.123/click, http.status_code=200, session.id=12ec27cf49981e66be488ce1e9608fce, http.method=POST}, capacity=128, totalAddedValues=4}, totalAttributeCount=4, events=[], totalRecordedEvents=0, links=[], totalRecordedLinks=0, status=ImmutableStatusData{statusCode=OK, description=Kind: OK Description:}, hasEnded=true}
06:58:00.463 DEBUG [RequestConverter.channelRead0] - Incoming message: DefaultHttpRequest(decodeResult: success, version: HTTP/1.1)
GET /session/12ec27cf49981e66be488ce1e9608fce/alert/text HTTP/1.1

NODE LOGS:

2024-07-18 06:59:04,506 INFO Included extra file "/etc/supervisor/conf.d/chrome-cleanup.conf" during parsing
2024-07-18 06:59:04,506 INFO Included extra file "/etc/supervisor/conf.d/selenium.conf" during parsing
2024-07-18 06:59:04,508 INFO RPC interface 'supervisor' initialized
2024-07-18 06:59:04,508 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2024-07-18 06:59:04,509 INFO supervisord started with pid 9
2024-07-18 06:59:05,511 INFO spawned: 'xvfb' with pid 17
2024-07-18 06:59:05,513 INFO spawned: 'vnc' with pid 18
2024-07-18 06:59:05,518 INFO spawned: 'novnc' with pid 20
2024-07-18 06:59:05,522 INFO spawned: 'selenium-node' with pid 25
2024-07-18 06:59:05,532 INFO success: selenium-node entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
E: [pulseaudio] main.c: Daemon startup failed.
No PulseAudio daemon running, or not running as session daemon.
No PulseAudio daemon running, or not running as session daemon.
No PulseAudio daemon running, or not running as session daemon.
Appending Selenium options: --log-level FINE
Appending Selenium options: --session-timeout 300
Appending Selenium options: --heartbeat-period 30
Generating Selenium Config
Configuring server...
Setting up SE_NODE_HOST...
Setting up SE_NODE_PORT...
Setting up SE_NODE_GRID_URL...
Tracing is disabled
Selenium Grid Node configuration: 
[events]
publish = "tcp://selenium-hub:4442"
subscribe = "tcp://selenium-hub:4443"

[node]
session-timeout = "300"
override-max-sessions = false
detect-drivers = false
drain-after-session-count = 0
max-sessions = 1

[[node.driver-configuration]]
display-name = "chrome"
stereotype = '{"browserName":"chrome","browserVersion":"122.0","goog:chromeOptions":{"binary":"/usr/bin/google-chrome"},"platformName":"linux","se:noVncPort":7900,"se:vncEnabled":true, "nodename:applicationName":"AccountCreation1"}'
max-sessions = 1

Starting Selenium Grid Node...
2024-07-18 06:59:06,605 INFO success: xvfb entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2024-07-18 06:59:06,605 INFO success: vnc entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2024-07-18 06:59:06,606 INFO success: novnc entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
06:59:06.914 INFO [LoggingOptions.configureLogEncoding] - Using the system default encoding
06:59:06.922 INFO [OpenTelemetryTracer.createTracer] - Using OpenTelemetry for tracing
06:59:07.022 DEBUG [JavaVersionSpecific.<clinit>] - Using the APIs optimized for: Java 9+
06:59:07.170 INFO [UnboundZmqEventBus.<init>] - Connecting to tcp://selenium-hub:4442 and tcp://selenium-hub:4443
06:59:07.267 INFO [UnboundZmqEventBus.<init>] - Sockets created
06:59:08.275 INFO [UnboundZmqEventBus.<init>] - Event bus ready
06:59:08.356 INFO [NodeServer.createHandlers] - Reporting self as: http://172.19.0.3:5555
06:59:08.371 INFO [NodeOptions.getSessionFactories] - Detected 4 available processors
06:59:08.408 INFO [NodeOptions.report] - Adding chrome for {"browserName": "chrome","browserVersion": "122.0","goog:chromeOptions": {"binary": "\u002fusr\u002fbin\u002fgoogle-chrome"},"nodename:applicationName": "AccountCreation1","platformName": "linux","se:noVncPort": 7900,"se:vncEnabled": true} 1 times
06:59:08.426 INFO [Node.<init>] - Binding additional locator mechanisms: relative
06:59:08.467 DEBUG [MultithreadEventLoopGroup.<clinit>] - -Dio.netty.eventLoopThreads: 8
06:59:08.472 DEBUG [GlobalEventExecutor.<clinit>] - -Dio.netty.globalEventExecutor.quietPeriodSeconds: 1
06:59:08.476 DEBUG [InternalThreadLocalMap.<clinit>] - -Dio.netty.threadLocalMap.stringBuilder.initialSize: 1024
06:59:08.476 DEBUG [InternalThreadLocalMap.<clinit>] - -Dio.netty.threadLocalMap.stringBuilder.maxSize: 4096
06:59:08.499 DEBUG [PlatformDependent0.explicitNoUnsafeCause0] - -Dio.netty.noUnsafe: false
06:59:08.500 DEBUG [PlatformDependent0.javaVersion0] - Java version: 17
06:59:08.501 DEBUG [PlatformDependent0.<clinit>] - sun.misc.Unsafe.theUnsafe: available
06:59:08.502 DEBUG [PlatformDependent0.<clinit>] - sun.misc.Unsafe base methods: all available
06:59:08.502 DEBUG [PlatformDependent0.<clinit>] - sun.misc.Unsafe.storeFence: available
06:59:08.503 DEBUG [PlatformDependent0.<clinit>] - java.nio.Buffer.address: available
06:59:08.503 DEBUG [PlatformDependent0.<clinit>] - direct buffer constructor: unavailable: Reflective setAccessible(true) disabled
06:59:08.504 DEBUG [PlatformDependent0.<clinit>] - java.nio.Bits.unaligned: available, true
06:59:08.505 DEBUG [PlatformDependent0.<clinit>] - jdk.internal.misc.Unsafe.allocateUninitializedArray(int): unavailable: class io.netty.util.internal.PlatformDependent0$7 cannot access class jdk.internal.misc.Unsafe (in module java.base) because module java.base does not export jdk.internal.misc to unnamed module @3130e399
06:59:08.506 DEBUG [PlatformDependent0.<clinit>] - java.nio.DirectByteBuffer.<init>(long, {int,long}): unavailable
06:59:08.506 DEBUG [PlatformDependent.unsafeUnavailabilityCause0] - sun.misc.Unsafe: available
06:59:08.506 DEBUG [PlatformDependent.tmpdir0] - -Dio.netty.tmpdir: /tmp (java.io.tmpdir)
06:59:08.506 DEBUG [PlatformDependent.bitMode0] - -Dio.netty.bitMode: 64 (sun.arch.data.model)
06:59:08.507 DEBUG [PlatformDependent.<clinit>] - -Dio.netty.maxDirectMemory: -1 bytes
06:59:08.507 DEBUG [PlatformDependent.<clinit>] - -Dio.netty.uninitializedArrayAllocationThreshold: -1
06:59:08.508 DEBUG [CleanerJava9.<clinit>] - java.nio.ByteBuffer.cleaner(): available
06:59:08.508 DEBUG [PlatformDependent.<clinit>] - -Dio.netty.noPreferDirect: false
06:59:08.509 DEBUG [NioEventLoop.<clinit>] - -Dio.netty.noKeySetOptimization: false
06:59:08.509 DEBUG [NioEventLoop.<clinit>] - -Dio.netty.selectorAutoRebuildThreshold: 512
06:59:08.513 DEBUG [PlatformDependent$Mpsc.<clinit>] - org.jctools-core.MpscChunkedArrayQueue: available
06:59:08.540 DEBUG [DefaultChannelId.<clinit>] - -Dio.netty.processId: 70 (auto-detected)
06:59:08.545 DEBUG [NetUtil.<clinit>] - -Djava.net.preferIPv4Stack: false
06:59:08.546 DEBUG [NetUtil.<clinit>] - -Djava.net.preferIPv6Addresses: false
06:59:08.547 DEBUG [NetUtilInitializations.determineLoopback] - Loopback interface: lo (lo, 0:0:0:0:0:0:0:1%lo)
06:59:08.548 DEBUG [NetUtil$SoMaxConnAction.run] - /proc/sys/net/core/somaxconn: 4096
06:59:08.549 DEBUG [DefaultChannelId.<clinit>] - -Dio.netty.machineId: 02:42:ac:ff:fe:13:00:03 (auto-detected)
06:59:08.558 DEBUG [ResourceLeakDetector.<clinit>] - -Dio.netty.leakDetection.level: simple
06:59:08.560 DEBUG [ResourceLeakDetector.<clinit>] - -Dio.netty.leakDetection.targetRecords: 4
06:59:08.600 DEBUG [PooledByteBufAllocator.<clinit>] - -Dio.netty.allocator.numHeapArenas: 8
06:59:08.600 DEBUG [PooledByteBufAllocator.<clinit>] - -Dio.netty.allocator.numDirectArenas: 8
06:59:08.600 DEBUG [PooledByteBufAllocator.<clinit>] - -Dio.netty.allocator.pageSize: 8192
06:59:08.601 DEBUG [PooledByteBufAllocator.<clinit>] - -Dio.netty.allocator.maxOrder: 9
06:59:08.601 DEBUG [PooledByteBufAllocator.<clinit>] - -Dio.netty.allocator.chunkSize: 4194304
06:59:08.601 DEBUG [PooledByteBufAllocator.<clinit>] - -Dio.netty.allocator.smallCacheSize: 256
06:59:08.601 DEBUG [PooledByteBufAllocator.<clinit>] - -Dio.netty.allocator.normalCacheSize: 64
06:59:08.601 DEBUG [PooledByteBufAllocator.<clinit>] - -Dio.netty.allocator.maxCachedBufferCapacity: 32768
06:59:08.601 DEBUG [PooledByteBufAllocator.<clinit>] - -Dio.netty.allocator.cacheTrimInterval: 8192
06:59:08.601 DEBUG [PooledByteBufAllocator.<clinit>] - -Dio.netty.allocator.cacheTrimIntervalMillis: 0
06:59:08.602 DEBUG [PooledByteBufAllocator.<clinit>] - -Dio.netty.allocator.useCacheForAllThreads: false
06:59:08.602 DEBUG [PooledByteBufAllocator.<clinit>] - -Dio.netty.allocator.maxCachedByteBuffersPerChunk: 1023
06:59:08.618 DEBUG [ByteBufUtil.<clinit>] - -Dio.netty.allocator.type: pooled
06:59:08.618 DEBUG [ByteBufUtil.<clinit>] - -Dio.netty.threadLocalDirectBufferSize: 0
06:59:08.619 DEBUG [ByteBufUtil.<clinit>] - -Dio.netty.maxThreadLocalCharBufferSize: 16384
06:59:08.620 DEBUG [ChannelInitializerExtensions.getExtensions] - -Dio.netty.bootstrap.extensions: null
06:59:08.642 DEBUG [LoggingHandler.channelRegistered] - [id: 0xf5b9e30c] REGISTERED
06:59:08.643 DEBUG [LoggingHandler.bind] - [id: 0xf5b9e30c] BIND: 0.0.0.0/0.0.0.0:5555
06:59:08.646 INFO [NodeServer$1.start] - Starting registration process for Node http://172.19.0.3:5555
06:59:08.647 DEBUG [LoggingHandler.channelActive] - [id: 0xf5b9e30c, L:/[0:0:0:0:0:0:0:0]:5555] ACTIVE
06:59:08.648 INFO [NodeServer.execute] - Started Selenium node 4.22.0 (revision c5f3146703): http://172.19.0.3:5555
06:59:08.665 INFO [NodeServer$1.lambda$start$1] - Sending registration event...
06:59:08.710 DEBUG [LoggingHandler.channelRead] - [id: 0xf5b9e30c, L:/[0:0:0:0:0:0:0:0]:5555] READ: [id: 0x1873089d, L:/172.19.0.3:5555 - R:/172.19.0.2:51646]
06:59:08.716 DEBUG [LoggingHandler.channelReadComplete] - [id: 0xf5b9e30c, L:/[0:0:0:0:0:0:0:0]:5555] READ COMPLETE
06:59:08.758 DEBUG [AbstractByteBuf.<clinit>] - -Dio.netty.buffer.checkAccessible: true
06:59:08.761 DEBUG [AbstractByteBuf.<clinit>] - -Dio.netty.buffer.checkBounds: true
06:59:08.762 DEBUG [ResourceLeakDetectorFactory$DefaultResourceLeakDetectorFactory.newResourceLeakDetector] - Loaded default ResourceLeakDetector: io.netty.util.ResourceLeakDetector@45d754a0
06:59:08.779 DEBUG [ZlibCodecFactory.<clinit>] - -Dio.netty.noJdkZlibDecoder: false
06:59:08.779 DEBUG [ZlibCodecFactory.<clinit>] - -Dio.netty.noJdkZlibEncoder: false
06:59:08.791 DEBUG [Recycler.<clinit>] - -Dio.netty.recycler.maxCapacityPerThread: 4096
06:59:08.792 DEBUG [Recycler.<clinit>] - -Dio.netty.recycler.ratio: 8
06:59:08.792 DEBUG [Recycler.<clinit>] - -Dio.netty.recycler.chunkSize: 32
06:59:08.792 DEBUG [Recycler.<clinit>] - -Dio.netty.recycler.blocking: false
06:59:08.792 DEBUG [Recycler.<clinit>] - -Dio.netty.recycler.batchFastThreadLocalOnly: true
06:59:08.820 DEBUG [RequestConverter.channelRead0] - Incoming message: DefaultHttpRequest(decodeResult: success, version: HTTP/1.1)
GET /status HTTP/1.1
Connection: Upgrade, HTTP2-Settings
Content-Length: 0
Host: 172.19.0.3:5555
HTTP2-Settings: AAEAAEAAAAIAAAABAAMAAABkAAQBAAAAAAUAAEAA
Upgrade: h2c
traceparent: 00-ae312ae8a8d4918aa9675d912b03824c-e26ac5951868a6a0-01
User-Agent: selenium/4.22.0 (java unix)
06:59:08.821 DEBUG [RequestConverter.channelRead0] - Start of http request: DefaultHttpRequest(decodeResult: success, version: HTTP/1.1)
GET /status HTTP/1.1
Connection: Upgrade, HTTP2-Settings
Content-Length: 0
Host: 172.19.0.3:5555
HTTP2-Settings: AAEAAEAAAAIAAAABAAMAAABkAAQBAAAAAAUAAEAA
Upgrade: h2c
traceparent: 00-ae312ae8a8d4918aa9675d912b03824c-e26ac5951868a6a0-01
User-Agent: selenium/4.22.0 (java unix)
06:59:08.847 DEBUG [RequestConverter.channelRead0] - Incoming message: EmptyLastHttpContent
06:59:08.847 DEBUG [RequestConverter.channelRead0] - End of http request: EmptyLastHttpContent
06:59:08.884 DEBUG [SpanWrappedHttpHandler.execute] - Wrapping request. Before OpenTelemetryContext{tracer=io.opentelemetry.sdk.trace.SdkTracer@646be2c3, context={}, span id=0000000000000000, trace id=00000000000000000000000000000000} and after OpenTelemetryContext{tracer=io.opentelemetry.sdk.trace.SdkTracer@646be2c3, context={opentelemetry-trace-span-key=SdkSpan{traceId=ae312ae8a8d4918aa9675d912b03824c, spanId=a9a374c9d1e81834, parentSpanContext=ImmutableSpanContext{traceId=ae312ae8a8d4918aa9675d912b03824c, spanId=e26ac5951868a6a0, traceFlags=01, traceState=ArrayBasedTraceState{entries=[]}, remote=true, valid=true}, name=node.status, kind=INTERNAL, attributes=AttributesMap{data={random.key=8a17eaa8-2d70-4728-9eba-47455e801da4}, capacity=128, totalAddedValues=1}, status=ImmutableStatusData{statusCode=UNSET, description=}, totalRecordedEvents=0, totalRecordedLinks=0, startEpochNanos=1721285948877779127, endEpochNanos=0}}, span id=a9a374c9d1e81834, trace id=ae312ae8a8d4918aa9675d912b03824c}
06:59:08.890 DEBUG [HttpTracing.inject] - Injecting (GET) /status into OpenTelemetrySpan{traceId=ae312ae8a8d4918aa9675d912b03824c,spanId=a9a374c9d1e81834} at org.openqa.selenium.remote.tracing.SpanWrappedHttpHandler:85
06:59:08.918 DEBUG [SeleniumSpanExporter$1.lambda$export$4] - SpanData{spanContext=ImmutableSpanContext{traceId=ae312ae8a8d4918aa9675d912b03824c, spanId=a9a374c9d1e81834, traceFlags=01, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=true}, parentSpanContext=ImmutableSpanContext{traceId=ae312ae8a8d4918aa9675d912b03824c, spanId=e26ac5951868a6a0, traceFlags=01, traceState=ArrayBasedTraceState{entries=[]}, remote=true, valid=true}, resource=Resource{schemaUrl=null, attributes={service.name="unknown_service:java", telemetry.sdk.language="java", telemetry.sdk.name="opentelemetry", telemetry.sdk.version="1.39.0"}}, instrumentationScopeInfo=InstrumentationScopeInfo{name=default, version=null, schemaUrl=null, attributes={}}, name=node.status, kind=INTERNAL, startEpochNanos=1721285948877779127, endEpochNanos=1721285948917404188, attributes=AttributesMap{data={http.status_code=200, random.key=8a17eaa8-2d70-4728-9eba-47455e801da4, span.kind=server, http.target=/status, http.method=GET}, capacity=128, totalAddedValues=5}, totalAttributeCount=5, events=[ImmutableEventData{name=HTTP request execution complete, attributes={http.flavor=1, http.handler_class="org.openqa.selenium.remote.http.Route$TemplatizedRoute", http.host="172.19.0.3:5555", http.method="GET", http.request_content_length="0", http.scheme="HTTP", http.status_code=200, http.target="/status", http.user_agent="selenium/4.22.0 (java unix)"}, epochNanos=1721285948916898117, totalAttributeCount=9}], totalRecordedEvents=1, links=[], totalRecordedLinks=0, status=ImmutableStatusData{statusCode=OK, description=Kind: OK Description:}, hasEnded=true}

Operating System

Linux

Docker Selenium version (image tag)

4.22.0

Selenium Grid chart version (chart version)

No response

github-actions[bot] commented 1 month ago

@kjjnygres, thank you for creating this issue. We will troubleshoot it as soon as we can.

Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

VietND96 commented 1 month ago

I think you can retest after this PR will be released in 4.23 https://github.com/SeleniumHQ/selenium/pull/14272

kjjnygres commented 1 month ago

Has the issue been identified? What is the issue here?

VietND96 commented 1 month ago

Are using Grid autoscaling in K8s? If yes, it could relate to another issue is described in https://github.com/SeleniumHQ/selenium/pull/14282

kjjnygres commented 1 month ago

No. I am spinning and destroying containers on demand.

VietND96 commented 1 month ago

Yes, so it could be related. When destroying the containers, I think the drain node API endpoint is used to detach the Node from the Hub. The node will have a chance to switch status from UP to DRAINING (for some time until the container stops completely).

kjjnygres commented 1 month ago

Oh. Waiting for the new release then. Thanks for explaining

edsherwin commented 1 month ago

Waiting for this release as well as I encountered issues when running parallel tests using Kubernetes with auto KEDA autoscaling. Hope this will fix my issue. Thanks

edsherwin commented 1 month ago

I have an issue also when running parallel tests wherein the ongoing tests (session or the session id) will be hijack by the incoming tests and that session will be closed even if there is still an ongoing test from owner of that session.

Unable to find session with ID: b77e7338533a3ccbd9c70eee277c156d

Nirioppai commented 1 month ago

I also encountered this issue, I currently bypass this by scheduling my tests to not run in parallel, Waiting for the release as well. Thank you.

VietND96 commented 1 month ago

Images tag 4.23.0-20240727 and Helm chart selenium-grid-0.33.0 contain 2 needed fixes mentioned above. Can you please verify and confirm.

kjjnygres commented 1 month ago

Issue still persists for me. Node is created, but is waiting in the queue. Session never starts.

Message: Could not start a new session. New session request timed out

Host info: host: '801aca98ceec', ip: '172.19.0.2'

Build info: version: '4.23.0', revision: '77010cd'

System info: os.name: 'Linux', os.arch: 'amd64', os.version: '5.14.0-427.22.1.el9_4.x86_64', java.version: '17.0.11'

Driver info: driver.version: unknown

Stacktrace:

at org.openqa.selenium.grid.sessionqueue.local.LocalNewSessionQueue.addToQueue (LocalNewSessionQueue.java:221)

at org.openqa.selenium.grid.sessionqueue.NewSessionQueue.lambda$new$0 (NewSessionQueue.java:68)

at org.openqa.selenium.remote.http.Route$TemplatizedRoute.handle (Route.java:192)

at org.openqa.selenium.remote.http.Route.execute (Route.java:69)

at org.openqa.selenium.remote.http.Route$CombinedRoute.handle (Route.java:360)

at org.openqa.selenium.remote.http.Route.execute (Route.java:69)

at org.openqa.selenium.grid.sessionqueue.NewSessionQueue.execute (NewSessionQueue.java:128)

at org.openqa.selenium.remote.tracing.SpanWrappedHttpHandler.execute (SpanWrappedHttpHandler.java:87)

at org.openqa.selenium.remote.http.Filter$1.execute (Filter.java:63)

at org.openqa.selenium.remote.http.Route$CombinedRoute.handle (Route.java:360)

at org.openqa.selenium.remote.http.Route.execute (Route.java:69)

at org.openqa.selenium.grid.router.Router.execute (Router.java:87)

at org.openqa.selenium.grid.web.EnsureSpecCompliantResponseHeaders.lambda$apply$0 (EnsureSpecCompliantResponseHeaders.java:34)

at org.openqa.selenium.remote.http.Filter$1.execute (Filter.java:63)

at org.openqa.selenium.remote.http.Route$CombinedRoute.handle (Route.java:360)

at org.openqa.selenium.remote.http.Route.execute (Route.java:69)

at org.openqa.selenium.remote.AddWebDriverSpecHeaders.lambda$apply$0 (AddWebDriverSpecHeaders.java:35)

at org.openqa.selenium.remote.ErrorFilter.lambda$apply$0 (ErrorFilter.java:44)

at org.openqa.selenium.remote.http.Filter$1.execute (Filter.java:63)

at org.openqa.selenium.remote.ErrorFilter.lambda$apply$0 (ErrorFilter.java:44)

at org.openqa.selenium.remote.http.Filter$1.execute (Filter.java:63)

at org.openqa.selenium.netty.server.SeleniumHandler.lambda$channelRead0$0 (SeleniumHandler.java:44)

at java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:539)

at java.util.concurrent.FutureTask.run (FutureTask.java:264)

at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1136)

at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:635)

at java.lang.Thread.run (Thread.java:840)

VietND96 commented 1 month ago

@kjjnygres, from client binding, can you provide all capabilities that you provided? I suspect that request session not match with Node stereotypes, so there is no session could start

VietND96 commented 1 month ago

Ok, I saw you are mentioning the scenario is setting custom capabilities for matching specific Nodes. For this, while doing the PR #2323 - adding caps se:containerName default in Node container stereotypes. I also saw something incorrectly reflected in node Chromium (abstract of Chrome and Edge) - se:containerName present in Node stereotypes however it didn't present in session caps (for example below screenshot). So, probably when session request set custom caps, it could not assign to any Node. Can you confirm that, in your setup, if request Firefox node, it is stilling working fine?

Node stereotypes

Session capabilities (Chrome)

Session capabilities (Firefox)

kjjnygres commented 1 month ago

I think the issue still is what you mentioned before: after destroying the container and spinning it up with the same name, there might be some issue with its status. which is causing this issue.

Stacktrace in last comment was my mistake. I was sending in wrong capabilities. Please find updated stacktrace below:

Message: Could not start a new session. New session request timed out

Host info: host: '801aca98ceec', ip: '172.19.0.2'

Build info: version: '4.23.0', revision: '77010cd'

System info: os.name: 'Linux', os.arch: 'amd64', os.version: '5.14.0-427.22.1.el9_4.x86_64', java.version: '17.0.11'

Driver info: driver.version: unknown

Stacktrace:

at org.openqa.selenium.grid.sessionqueue.local.LocalNewSessionQueue.addToQueue (LocalNewSessionQueue.java:221)

at org.openqa.selenium.grid.sessionqueue.NewSessionQueue.lambda$new$0 (NewSessionQueue.java:68)

at org.openqa.selenium.remote.http.Route$TemplatizedRoute.handle (Route.java:192)

at org.openqa.selenium.remote.http.Route.execute (Route.java:69)

at org.openqa.selenium.remote.http.Route$CombinedRoute.handle (Route.java:360)

at org.openqa.selenium.remote.http.Route.execute (Route.java:69)

at org.openqa.selenium.grid.sessionqueue.NewSessionQueue.execute (NewSessionQueue.java:128)

at org.openqa.selenium.remote.tracing.SpanWrappedHttpHandler.execute (SpanWrappedHttpHandler.java:87)

at org.openqa.selenium.remote.http.Filter$1.execute (Filter.java:63)

at org.openqa.selenium.remote.http.Route$CombinedRoute.handle (Route.java:360)

at org.openqa.selenium.remote.http.Route.execute (Route.java:69)

at org.openqa.selenium.grid.router.Router.execute (Router.java:87)

at org.openqa.selenium.grid.web.EnsureSpecCompliantResponseHeaders.lambda$apply$0 (EnsureSpecCompliantResponseHeaders.java:34)

at org.openqa.selenium.remote.http.Filter$1.execute (Filter.java:63)

at org.openqa.selenium.remote.http.Route$CombinedRoute.handle (Route.java:360)

at org.openqa.selenium.remote.http.Route.execute (Route.java:69)

at org.openqa.selenium.remote.AddWebDriverSpecHeaders.lambda$apply$0 (AddWebDriverSpecHeaders.java:35)

at org.openqa.selenium.remote.ErrorFilter.lambda$apply$0 (ErrorFilter.java:44)

at org.openqa.selenium.remote.http.Filter$1.execute (Filter.java:63)

at org.openqa.selenium.remote.ErrorFilter.lambda$apply$0 (ErrorFilter.java:44)

at org.openqa.selenium.remote.http.Filter$1.execute (Filter.java:63)

at org.openqa.selenium.netty.server.SeleniumHandler.lambda$channelRead0$0 (SeleniumHandler.java:44)

at java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:539)

at java.util.concurrent.FutureTask.run (FutureTask.java:264)

at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1136)

at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:635)

at java.lang.Thread.run (Thread.java:840)

kjjnygres commented 1 month ago

I am using capabilities like this and it is working fine for me.

I am using this capability from my code:

SE_NODE_STEREOTYPE='{"browserName":"chrome","browserVersion":"122.0","goog:chromeOptions":{"binary":"/usr/bin/google-chrome"},"platformName":"linux","se:noVncPort":7900,"se:vncEnabled":true, "nodename:applicationName":"AccountCreation1"}'

VietND96 commented 1 month ago

Ok, Node status switches to DRAINING from UP only API endpoint drain node is called - https://www.selenium.dev/documentation/grid/advanced_features/endpoints/#drain. This is something to do in container entry point, since via docker stop, it could go to a situation where Node status is still UP, and suddenly it down by docker stop, the DOWN status not come to Hub immediately (since health checks trigger by interval)

kjjnygres commented 1 month ago

So may be I have to wait between docker stop and docker run for a same container?

VietND96 commented 1 month ago

Let me try to add a mechanism graceful shutdown the Node when deploying it via docker, or docker-compose

VietND96 commented 1 month ago

@kjjnygres Btw, do you have any script publicly that I can use to test your scenario with the implementation?

kjjnygres commented 1 month ago

I'm afraid no. But what I do is before running a container, I check if container with the same name is already up or not. If so, I destroy that container and then run with the same name. There is not much time between these two activities. I think you can reproduce by creating a small script wherein you first run a container, destroy it, and run again.

Here is how I do it in my code:

sh script: "docker stop ${job} || true && docker rm ${job} || true"

sh script: "docker run -d --net grid --name ${job} -e SE_EVENT_BUS_HOST=selenium-hub -e SE_NODE_STEREOTYPE='{\"browserName\":\"chrome\",\"browserVersion\":\"122.0\",\"goog:chromeOptions\":{\"binary\":\"/usr/bin/google-chrome\"},\"platformName\":\"linux\",\"se:noVncPort\":7900,\"se:vncEnabled\":true, \"nodename:applicationName\":\"${job}\"}' --cap-add=CAP_AUDIT_WRITE --shm-size=\"2g\" -e SE_EVENT_BUS_PUBLISH_PORT=4442 -e SE_EVENT_BUS_SUBSCRIBE_PORT=4443 localhost/node-image-xx"

kjjnygres commented 1 month ago

Just an update: it is now happening very rarely. I almost forgot about this issue :)

edsherwin commented 1 month ago

I also encountered the same issue using standalone docker configuration. The tests will trigger in parallel and creates browser sessions. However, the test execution for each session is overriding.

`version: "3" services: selenium-event-bus: image: selenium/event-bus:4.23.0-20240727 container_name: selenium-event-bus ports:

"4442:4442"
"4443:4443"
"5557:5557" networks:
grid restart: unless-stopped

selenium-sessions: image: selenium/sessions:4.23.0-20240727 container_name: selenium-sessions ports:
"5556:5556" depends_on:
selenium-event-bus environment:
SE_EVENT_BUS_HOST=selenium-event-bus
SE_EVENT_BUS_PUBLISH_PORT=4442
SE_EVENT_BUS_SUBSCRIBE_PORT=4443 networks:
grid restart: unless-stopped

selenium-session-queue: image: selenium/session-queue:4.23.0-20240727 container_name: selenium-session-queue ports:
"5559:5559" depends_on:
selenium-event-bus environment:
SE_EVENT_BUS_HOST=selenium-event-bus
SE_EVENT_BUS_PUBLISH_PORT=4442
SE_EVENT_BUS_SUBSCRIBE_PORT=4443
SE_SESSION_REQUEST_TIMEOUT=700 networks:
grid restart: unless-stopped

selenium-distributor: image: selenium/distributor:4.23.0-20240727 container_name: selenium-distributor ports:
"5553:5553" depends_on:
selenium-event-bus
selenium-sessions
selenium-session-queue environment:
SE_EVENT_BUS_HOST=selenium-event-bus
SE_EVENT_BUS_PUBLISH_PORT=4442
SE_EVENT_BUS_SUBSCRIBE_PORT=4443
SE_SESSIONS_MAP_HOST=selenium-sessions
SE_SESSIONS_MAP_PORT=5556
SE_SESSION_QUEUE_HOST=selenium-session-queue
SE_SESSION_QUEUE_PORT=5559 networks:
grid restart: unless-stopped

selenium-router: image: selenium/router:4.23.0-20240727 container_name: selenium-router ports:
"4444:4444" depends_on:
selenium-distributor
selenium-sessions
selenium-session-queue environment:
SE_DISTRIBUTOR_HOST=selenium-distributor
SE_DISTRIBUTOR_PORT=5553
SE_SESSIONS_MAP_HOST=selenium-sessions
SE_SESSIONS_MAP_PORT=5556
SE_SESSION_QUEUE_HOST=selenium-session-queue
SE_SESSION_QUEUE_PORT=5559
SE_OPTS=--log-level FINE networks:
grid restart: unless-stopped

chrome: image: selenium/node-chrome:4.23.0-20240727 shm_size: 2gb depends_on:
selenium-event-bus
- selenium-router

environment:
SE_EVENT_BUS_HOST=selenium-event-bus
SE_EVENT_BUS_PUBLISH_PORT=4442
SE_EVENT_BUS_SUBSCRIBE_PORT=4443
SE_NODE_GRID_URL=http://localhost:4444
SE_SCREEN_WIDTH=2560
SE_JAVA_OPTS=-Xmx8192m
SE_SCREEN_HEIGHT=1440
SE_SCREEN_DEPTH=24
SE_NODE_PORT=5555
SE_SESSION_REQUEST_TIMEOUT=800
SE_SESSION_RETRY_INTERVAL=2
SE_NODE_MAX_SESSIONS=20
SE_NODE_OVERRIDE_MAX_SESSIONS=true
SE_NODE_SESSION_TIMEOUT=6000
SE_VNC_NO_PASSWORD=1
SE_VNC_VIEW_ONLY=1
SE_VNC_ULIMIT=4096 networks:
grid ulimits: nofile: soft: 65536 hard: 65536 restart: unless-stopped

networks: grid: driver: bridge`

Btw, I'm using simple robotframework script using browser library.

` Settings Library Browser

Test Cases Debug 001 New Browser browser=chromium headless=False args=["--start-maximized"] New Context ignoreHTTPSErrors=True viewport=${None} New Page https://robocon.io Sleep 10s`

command to execute in terminal

SELENIUM_REMOTE_URL=http://localhost:4444 robot -v TS.robot

github-actions[bot] commented 1 day ago

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

SeleniumHQ / docker-selenium