eclipse-ee4j / grizzly

Grizzly
https://eclipse-ee4j.github.io/grizzly
Other
148 stars 69 forks source link

Simultaneous reading and writing causes NPE. #2005

Closed facundovs closed 1 year ago

facundovs commented 6 years ago

Hi guys.

In the Mule ESB we are facing the following sporadic NPEs with the following stack trace:

04-16 09:18:59.301] DEBUG HttpCodecFilter [[featurespace-system-service-uat].http.requester.HTTPS_Request_Configuration_Featurespace.worker(2)]: Error parsing HTTP header java.lang.NullPointerException: null at org.glassfish.grizzly.http.HttpClientFilter$ClientHttpResponseImpl.getProcessingState(HttpClientFilter.java:719) ~[grizzly-http-2.3.35.jar:2.3.35] at com.ning.http.client.providers.grizzly.HttpTransactionContext.currentTransaction(HttpTransactionContext.java:134) ~[grizzly-http-client-1.14.jar:?] at com.ning.http.client.providers.grizzly.AhcEventFilter.onInitialLineParsed(AhcEventFilter.java:181) ~[grizzly-http-client-1.14.jar:?] at org.glassfish.grizzly.http.HttpClientFilter.decodeInitialLineFromBuffer(HttpClientFilter.java:542) ~[grizzly-http-2.3.35.jar:2.3.35] at org.glassfish.grizzly.http.HttpCodecFilter.decodeHttpPacketFromBuffer(HttpCodecFilter.java:1047) ~[grizzly-http-2.3.35.jar:2.3.35] at org.glassfish.grizzly.http.HttpCodecFilter.decodeHttpPacket(HttpCodecFilter.java:743) ~[grizzly-http-2.3.35.jar:2.3.35] at org.glassfish.grizzly.http.HttpCodecFilter.handleRead(HttpCodecFilter.java:589) [grizzly-http-2.3.35.jar:2.3.35] at org.glassfish.grizzly.http.HttpClientFilter.handleRead(HttpClientFilter.java:175) [grizzly-http-2.3.35.jar:2.3.35] at org.glassfish.grizzly.filterchain.ExecutorResolver$9.execute(ExecutorResolver.java:119) [grizzly-framework-2.3.35.jar:2.3.35] at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeFilter(DefaultFilterChain.java:284) [grizzly-framework-2.3.35.jar:2.3.35] at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeChainPart(DefaultFilterChain.java:201) [grizzly-framework-2.3.35.jar:2.3.35] at org.glassfish.grizzly.filterchain.DefaultFilterChain.execute(DefaultFilterChain.java:133) [grizzly-framework-2.3.35.jar:2.3.35] at org.glassfish.grizzly.filterchain.DefaultFilterChain.process(DefaultFilterChain.java:112) [grizzly-framework-2.3.35.jar:2.3.35] at org.glassfish.grizzly.ProcessorExecutor.execute(ProcessorExecutor.java:77) [grizzly-framework-2.3.35.jar:2.3.35] at org.glassfish.grizzly.nio.transport.TCPNIOTransport.fireIOEvent(TCPNIOTransport.java:539) [grizzly-framework-2.3.35.jar:2.3.35] at org.glassfish.grizzly.strategies.AbstractIOStrategy.fireIOEvent(AbstractIOStrategy.java:112) [grizzly-framework-2.3.35.jar:2.3.35] at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy.run0(WorkerThreadIOStrategy.java:117) [grizzly-framework-2.3.35.jar:2.3.35] at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy.access$100(WorkerThreadIOStrategy.java:56) [grizzly-framework-2.3.35.jar:2.3.35] at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy$WorkerThreadRunnable.run(WorkerThreadIOStrategy.java:137) [grizzly-framework-2.3.35.jar:2.3.35] at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:593) [grizzly-framework-2.3.35.jar:2.3.35] at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:573) [grizzly-framework-2.3.35.jar:2.3.35] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]

After some investigation I could discover that:

1) In the scenario when the issue Is reproduced, the non-blocking requestQueue saved in the connection as an atribute returns NULL in this line. This seems to be a race condition what takes place when the writting and the reading are done asynchronously (and the reading occurs faster than the writting, which puts the request packet in the queue after the reading thread polls it).

2) The issue is a consequence of the interaction between Grizzly AHC and the Grizzly Framework:

at org.glassfish.grizzly.http.HttpClientFilter.handleWrite(HttpClientFilter.java:126) at org.glassfish.grizzly.filterchain.ExecutorResolver$8.execute(ExecutorResolver.java:111) at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeFilter(DefaultFilterChain.java:284) at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeChainPart(DefaultFilterChain.java:201) at org.glassfish.grizzly.filterchain.DefaultFilterChain.execute(DefaultFilterChain.java:133) at org.glassfish.grizzly.filterchain.DefaultFilterChain.process(DefaultFilterChain.java:112) at org.glassfish.grizzly.ProcessorExecutor.execute(ProcessorExecutor.java:77) at org.glassfish.grizzly.filterchain.FilterChainContext.write(FilterChainContext.java:890) at org.glassfish.grizzly.filterchain.FilterChainContext.write(FilterChainContext.java:756) at com.ning.http.client.providers.grizzly.PayloadGenFactory$ByteArrayPayloadGenerator.generate(PayloadGenFactory.java:154) at com.ning.http.client.providers.grizzly.AsyncHttpClientFilter.sendRequest(AsyncHttpClientFilter.java:273) at com.ning.http.client.providers.grizzly.AsyncHttpClientFilter.sendAsGrizzlyRequest(AsyncHttpClientFilter.java:202) at com.ning.http.client.providers.grizzly.AsyncHttpClientFilter.handleWrite(AsyncHttpClientFilter.java:90) at org.glassfish.grizzly.filterchain.ExecutorResolver$8.execute(ExecutorResolver.java:111) at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeFilter(DefaultFilterChain.java:284) at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeChainPart(DefaultFilterChain.java:201) at org.glassfish.grizzly.filterchain.DefaultFilterChain.execute(DefaultFilterChain.java:133) at org.glassfish.grizzly.filterchain.DefaultFilterChain.process(DefaultFilterChain.java:112) at org.glassfish.grizzly.ProcessorExecutor.execute(ProcessorExecutor.java:77) at org.glassfish.grizzly.filterchain.DefaultFilterChain.write(DefaultFilterChain.java:414) at org.glassfish.grizzly.nio.NIOConnection.write(NIOConnection.java:450) at org.glassfish.grizzly.nio.NIOConnection.write(NIOConnection.java:424) at com.ning.http.client.providers.grizzly.GrizzlyAsyncHttpProvider.execute(GrizzlyAsyncHttpProvider.java:224) at com.ning.http.client.providers.grizzly.GrizzlyAsyncHttpProvider$1.completed(GrizzlyAsyncHttpProvider.java:150) at com.ning.http.client.providers.grizzly.GrizzlyAsyncHttpProvider$1.completed(GrizzlyAsyncHttpProvider.java:130) at org.glassfish.grizzly.utils.Futures.notifyResult(Futures.java:110) at org.glassfish.grizzly.connectionpool.SingleEndpointPool$ConnectCompletionHandler.completed(SingleEndpointPool.java:1278) at org.glassfish.grizzly.connectionpool.SingleEndpointPool$ConnectCompletionHandler.completed(SingleEndpointPool.java:1245) at org.glassfish.grizzly.impl.SafeFutureImpl.notifyCompletionHandlers(SafeFutureImpl.java:185) at org.glassfish.grizzly.impl.SafeFutureImpl.done(SafeFutureImpl.java:273) at org.glassfish.grizzly.impl.SafeFutureImpl$Sync.innerSet(SafeFutureImpl.java:369) at org.glassfish.grizzly.impl.SafeFutureImpl.result(SafeFutureImpl.java:109) at org.glassfish.grizzly.utils.CompletionHandlerAdapter.completed(CompletionHandlerAdapter.java:155) at org.glassfish.grizzly.nio.transport.TCPNIOConnectorHandler$EnableReadHandler.onComplete(TCPNIOConnectorHandler.java:348) at org.glassfish.grizzly.ProcessorExecutor.complete(ProcessorExecutor.java:115) at org.glassfish.grizzly.ProcessorExecutor.complete0(ProcessorExecutor.java:208) at org.glassfish.grizzly.ProcessorExecutor.execute(ProcessorExecutor.java:86) at org.glassfish.grizzly.nio.transport.TCPNIOTransport.fireIOEvent(TCPNIOTransport.java:539) at org.glassfish.grizzly.nio.transport.TCPNIOConnectorHandler.onConnectedAsync(TCPNIOConnectorHandler.java:226) at org.glassfish.grizzly.nio.transport.TCPNIOConnectorHandler$1.connected(TCPNIOConnectorHandler.java:158) at org.glassfish.grizzly.nio.transport.TCPNIOConnection.onConnect(TCPNIOConnection.java:258) at org.glassfish.grizzly.nio.transport.TCPNIOTransport.fireIOEvent(TCPNIOTransport.java:530) at org.glassfish.grizzly.strategies.AbstractIOStrategy.fireIOEvent(AbstractIOStrategy.java:112) at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy.run0(WorkerThreadIOStrategy.java:117) at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy.executeIoEvent(WorkerThreadIOStrategy.java:103) at org.glassfish.grizzly.strategies.AbstractIOStrategy.executeIoEvent(AbstractIOStrategy.java:89) at org.glassfish.grizzly.nio.SelectorRunner.iterateKeyEvents(SelectorRunner.java:415) at org.glassfish.grizzly.nio.SelectorRunner.iterateKeys(SelectorRunner.java:384) at org.glassfish.grizzly.nio.SelectorRunner.doSelect(SelectorRunner.java:348) at org.glassfish.grizzly.nio.SelectorRunner.run(SelectorRunner.java:279) at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:593) at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:573)

at com.ning.http.client.providers.grizzly.GrizzlyAsyncHttpProvider$1.completed(GrizzlyAsyncHttpProvider.java:130) at org.glassfish.grizzly.utils.Futures.notifyResult(Futures.java:110) at org.glassfish.grizzly.connectionpool.SingleEndpointPool$ConnectCompletionHandler.completed(SingleEndpointPool.java:1278) at org.glassfish.grizzly.connectionpool.SingleEndpointPool$ConnectCompletionHandler.completed(SingleEndpointPool.java:1245) at org.glassfish.grizzly.impl.SafeFutureImpl.notifyCompletionHandlers(SafeFutureImpl.java:185) at org.glassfish.grizzly.impl.SafeFutureImpl.addCompletionHandler(SafeFutureImpl.java:83) at org.glassfish.grizzly.connectionpool.SingleEndpointPool.connect(SingleEndpointPool.java:1157) at org.glassfish.grizzly.connectionpool.SingleEndpointPool.take(SingleEndpointPool.java:788) at org.glassfish.grizzly.connectionpool.MultiEndpointPool.take(MultiEndpointPool.java:592) at com.ning.http.client.providers.grizzly.ConnectionManager.openAsync(ConnectionManager.java:143) at com.ning.http.client.providers.grizzly.GrizzlyAsyncHttpProvider.execute(GrizzlyAsyncHttpProvider.java:174) at com.ning.http.client.AsyncHttpClient.executeRequest(AsyncHttpClient.java:505) at com.ning.http.client.AsyncHttpClient.executeRequest(AsyncHttpClient.java:527) at org.mule.module.http.internal.request.grizzly.GrizzlyHttpClient.sendAndWait(GrizzlyHttpClient.java:360) at org.mule.module.http.internal.request.grizzly.GrizzlyHttpClient.send(GrizzlyHttpClient.java:307)

As you see; the race condition occurs between the requester thread and the selector thread. I was able to reproduce the case, creating a SocketServer that just starts to read the HTTP response when the connection is stablished. I am aware of that writting the response without reading the request first or reading the request and writting the response asynchronously are not the most common scenarios; but shouldn't this cases be contemplated? Anyway, Failing with a NPE is not aceptable... At least a validation with a descriptive error should be added.

Note: I've also see that the read operation is enabled as soon the connection is ready, what is the root cause of the race condition, as you see here.

Thank you very much.

facundovs commented 6 years ago

I've created this test case to reproduce the issue.

Just clone my fork and run it.

fsgonz commented 6 years ago

I think the problem arises because the completion handler in the connect method in SingleEndpointPool is registered after connect. Internally as there is no completionHandler a default completion handler is used and the read operation is enabled resulting in the race condition (it is possible that the read operation is performed before the write).

This happens here:

https://github.com/eclipse-ee4j/grizzly/blob/59e799397c70301f7c5f5d3d75933db7fa073e66/extras/connection-pool/src/main/java/org/glassfish/grizzly/connectionpool/SingleEndpointPool.java#L1132

I've created a possible fix (but it involves changing the interface), so that the completionHandler can be passed and the future can retrieved to manage the timeout.

https://github.com/fsgonz/grizzly-1/commit/f8edf95a7e404bd147a6b8e36ec53a9209f0f6d8

This also involves changing the AhcEndpoint in ahc.

Let me know what you think about these possible changes.

fsgonz commented 6 years ago

Notice that I don't find a way to avoid the race condition without changing the endpoint abstract class.