Closed ctron closed 6 years ago
Ok so in this case it isn't a NPE, but indeed the exception message is missing:
17:29:59.227 [vert.x-eventloop-thread-0] DEBUG o.e.h.a.h.AbstractVertxBasedHttpProtocolAdapter - cannot process message for device [tenantId: DEFAULT_TENANT, deviceId: hono-demo-1-simulator-http-47-kx8g6-297, endpoint: telemetry]
org.eclipse.hono.client.ServerErrorException: null
at org.eclipse.hono.client.impl.HonoClientImpl.getOrCreateRequestResponseClient(HonoClientImpl.java:460)
at org.eclipse.hono.client.impl.HonoClientImpl.getOrCreateRegistrationClient(HonoClientImpl.java:471)
at org.eclipse.hono.service.AbstractProtocolAdapterBase.getRegistrationClient(AbstractProtocolAdapterBase.java:387)
at org.eclipse.hono.service.AbstractProtocolAdapterBase.getRegistrationAssertion(AbstractProtocolAdapterBase.java:413)
at org.eclipse.hono.adapter.http.AbstractVertxBasedHttpProtocolAdapter.doUploadMessage(AbstractVertxBasedHttpProtocolAdapter.java:471)
at org.eclipse.hono.adapter.http.AbstractVertxBasedHttpProtocolAdapter.uploadTelemetryMessage(AbstractVertxBasedHttpProtocolAdapter.java:400)
at org.eclipse.hono.adapter.http.AbstractVertxBasedHttpProtocolAdapter.uploadTelemetryMessage(AbstractVertxBasedHttpProtocolAdapter.java:371)
at org.eclipse.hono.adapter.http.vertx.VertxBasedHttpProtocolAdapter.lambda$addTelemetryApiRoutes$0(VertxBasedHttpProtocolAdapter.java:81)
at io.vertx.ext.web.impl.RouteImpl.handleContext(RouteImpl.java:223)
at io.vertx.ext.web.impl.RoutingContextImplBase.iterateNext(RoutingContextImplBase.java:101)
at io.vertx.ext.web.impl.RoutingContextImpl.next(RoutingContextImpl.java:120)
at io.vertx.ext.web.handler.impl.BodyHandlerImpl$BHandler.doEnd(BodyHandlerImpl.java:209)
at io.vertx.ext.web.handler.impl.BodyHandlerImpl$BHandler.end(BodyHandlerImpl.java:187)
at io.vertx.ext.web.handler.impl.BodyHandlerImpl.lambda$handle$0(BodyHandlerImpl.java:68)
at io.vertx.core.http.impl.HttpServerRequestImpl.handleEnd(HttpServerRequestImpl.java:418)
at io.vertx.core.http.impl.ServerConnection.handleLastHttpContent(ServerConnection.java:475)
at io.vertx.core.http.impl.ServerConnection.handleContent(ServerConnection.java:469)
at io.vertx.core.http.impl.ServerConnection.processMessage(ServerConnection.java:449)
at io.vertx.core.http.impl.ServerConnection.handleMessage(ServerConnection.java:156)
at io.vertx.core.http.impl.HttpServerImpl$ServerHandlerWithWebSockets.handleMessage(HttpServerImpl.java:705)
at io.vertx.core.http.impl.HttpServerImpl$ServerHandlerWithWebSockets.handleMessage(HttpServerImpl.java:614)
at io.vertx.core.net.impl.VertxHandler.lambda$channelRead$1(VertxHandler.java:150)
at io.vertx.core.impl.ContextImpl.lambda$wrapTask$2(ContextImpl.java:342)
at io.vertx.core.impl.ContextImpl.executeFromIO(ContextImpl.java:200)
at io.vertx.core.net.impl.VertxHandler.channelRead(VertxHandler.java:148)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.vertx.core.http.impl.Http1xOrH2CHandler.end(Http1xOrH2CHandler.java:49)
at io.vertx.core.http.impl.Http1xOrH2CHandler.channelRead(Http1xOrH2CHandler.java:27)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at java.lang.Thread.run(Thread.java:748)
Which is caused by (HonoClientImpl.java):
resultHandler.handle(Future.failedFuture(new ServerErrorException(HttpURLConnection.HTTP_UNAVAILABLE)));
I am not sure if I understand your point. Once the connection between the protocol adapter and the service it was connected to is terminated (by the service), the adapter needs to re-connect to the service. During the time it takes to re-establish the connection, the adapter will not be able to handle any requests. In case of the HTTP adapter, the device(s) should thus get 503 responses. Once the downstream connections have been re-established, messages should be processed again.
Are you saying that the adapter is not able to re-connect to the downstream service(s) at all?
Yes, having a short period of down-time would not be ideal, but fine. I also think there could be an improvement on the "second" (and following) request, which don't get queued when requesting a new sender.
But what I do see is that the connection never comes back:
Messages seem to be discarded at the level of the HTTP adapter, at least that is what the Hono metrics suggest. The messaging service does have a connection established to the qrouterd, but now messages flow over the link.
Are you experiencing this in conjunction with enMasse? We have seen that at some times Hono Messaging is not able to establish a sender link with the router of enMasse. In this case, the protocol adapter also cannot establish a sender link to Hono Messaging even if both connections (protocol adapter -> Hono Messaging -> router) are established.
Yes I do. I do think what you describe matches the picture I see. After digging more into this I think the connection is established. But either the sender link could not be established or doesn't receive any credit:
If the sender link cannot be established, shouldn't there be some kind of fallback?
Hmm, what could that be? Are you talking about something like:
I think here it is visible that this is an issue between the HTTP adapter and the messaging component:
Ok, not necessarily. As the messaging component still seems to have one link open per inbound connection from the HTTP adapter:
When I turn up the log to trace in the HTTP adapter I see the following:
January 24th 2018, 15:32:46.438 eclipsehono-hono-adapter-http-vertx 14:32:46.438 [vert.x-eventloop-thread-0] DEBUG o.e.h.a.h.AbstractVertxBasedHttpProtocolAdapter - cannot process message for device [tenantId: DEFAULT_TENANT, deviceId: hono-demo-1-simulator-http-51-20m5h-458, endpoint: telemetry]
January 24th 2018, 15:32:46.450 eclipsehono-hono-adapter-http-vertx 14:32:46.450 [vert.x-eventloop-thread-0] DEBUG o.e.hono.client.impl.HonoClientImpl - already trying to create a message sender for telemetry/DEFAULT_TENANT
January 24th 2018, 15:32:46.450 eclipsehono-hono-adapter-http-vertx 14:32:46.450 [vert.x-eventloop-thread-0] DEBUG o.e.hono.client.impl.HonoClientImpl - already trying to create a client for registration/DEFAULT_TENANT
January 24th 2018, 15:32:46.450 eclipsehono-hono-adapter-http-vertx 14:32:46.450 [vert.x-eventloop-thread-0] DEBUG o.e.h.a.h.AbstractVertxBasedHttpProtocolAdapter - cannot process message for device [tenantId: DEFAULT_TENANT, deviceId: hono-demo-1-simulator-http-51-20m5h-492, endpoint: telemetry]
January 24th 2018, 15:32:46.450 eclipsehono-hono-adapter-http-vertx org.eclipse.hono.client.ServerErrorException: no connection to service
January 24th 2018, 15:32:46.450 eclipsehono-hono-adapter-http-vertx at org.eclipse.hono.client.impl.HonoClientImpl.getOrCreateRequestResponseClient(HonoClientImpl.java:485)
January 24th 2018, 15:32:46.450 eclipsehono-hono-adapter-http-vertx at org.eclipse.hono.client.impl.HonoClientImpl.getOrCreateRegistrationClient(HonoClientImpl.java:497)
January 24th 2018, 15:32:46.450 eclipsehono-hono-adapter-http-vertx at org.eclipse.hono.service.AbstractProtocolAdapterBase.getRegistrationClient(AbstractProtocolAdapterBase.java:403)
January 24th 2018, 15:32:46.450 eclipsehono-hono-adapter-http-vertx at org.eclipse.hono.service.AbstractProtocolAdapterBase.getRegistrationAssertion(AbstractProtocolAdapterBase.java:429)
January 24th 2018, 15:32:46.450 eclipsehono-hono-adapter-http-vertx at org.eclipse.hono.adapter.http.AbstractVertxBasedHttpProtocolAdapter.doUploadMessage(AbstractVertxBasedHttpProtocolAdapter.java:469)
January 24th 2018, 15:32:46.450 eclipsehono-hono-adapter-http-vertx at org.eclipse.hono.adapter.http.AbstractVertxBasedHttpProtocolAdapter.uploadTelemetryMessage(AbstractVertxBasedHttpProtocolAdapter.java:398)
January 24th 2018, 15:32:46.450 eclipsehono-hono-adapter-http-vertx at org.eclipse.hono.adapter.http.AbstractVertxBasedHttpProtocolAdapter.uploadTelemetryMessage(AbstractVertxBasedHttpProtocolAdapter.java:369)
January 24th 2018, 15:32:46.450 eclipsehono-hono-adapter-http-vertx at org.eclipse.hono.adapter.http.vertx.VertxBasedHttpProtocolAdapter.lambda$addTelemetryApiRoutes$0(VertxBasedHttpProtocolAdapter.java:82)
January 24th 2018, 15:32:46.450 eclipsehono-hono-adapter-http-vertx at io.vertx.ext.web.impl.RouteImpl.handleContext(RouteImpl.java:223)
January 24th 2018, 15:32:46.450 eclipsehono-hono-adapter-http-vertx at io.vertx.ext.web.impl.RoutingContextImplBase.iterateNext(RoutingContextImplBase.java:101)
January 24th 2018, 15:32:46.450 eclipsehono-hono-adapter-http-vertx at io.vertx.ext.web.impl.RoutingContextImpl.next(RoutingContextImpl.java:120)
January 24th 2018, 15:32:46.450 eclipsehono-hono-adapter-http-vertx at io.vertx.ext.web.handler.impl.BodyHandlerImpl$BHandler.doEnd(BodyHandlerImpl.java:209)
January 24th 2018, 15:32:46.450 eclipsehono-hono-adapter-http-vertx at io.vertx.ext.web.handler.impl.BodyHandlerImpl$BHandler.end(BodyHandlerImpl.java:187)
January 24th 2018, 15:32:46.450 eclipsehono-hono-adapter-http-vertx at io.vertx.ext.web.handler.impl.BodyHandlerImpl.lambda$handle$0(BodyHandlerImpl.java:68)
January 24th 2018, 15:32:46.450 eclipsehono-hono-adapter-http-vertx at io.vertx.core.http.impl.HttpServerRequestImpl.handleEnd(HttpServerRequestImpl.java:418)
January 24th 2018, 15:32:46.450 eclipsehono-hono-adapter-http-vertx at io.vertx.core.http.impl.ServerConnection.handleLastHttpContent(ServerConnection.java:475)
January 24th 2018, 15:32:46.450 eclipsehono-hono-adapter-http-vertx at io.vertx.core.http.impl.ServerConnection.handleContent(ServerConnection.java:469)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.vertx.core.http.impl.Http1xOrH2CHandler.channelRead(Http1xOrH2CHandler.java:27)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.vertx.core.http.impl.ServerConnection.processMessage(ServerConnection.java:449)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.vertx.core.http.impl.HttpServerImpl$ServerHandlerWithWebSockets.handleMessage(HttpServerImpl.java:614)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.vertx.core.net.impl.VertxHandler.lambda$channelRead$1(VertxHandler.java:150)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.vertx.core.impl.ContextImpl.lambda$wrapTask$2(ContextImpl.java:342)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.vertx.core.impl.ContextImpl.executeFromIO(ContextImpl.java:200)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.vertx.core.net.impl.VertxHandler.channelRead(VertxHandler.java:148)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.vertx.core.http.impl.Http1xOrH2CHandler.end(Http1xOrH2CHandler.java:49)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.vertx.core.http.impl.HttpServerImpl$ServerHandlerWithWebSockets.handleMessage(HttpServerImpl.java:705)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at java.lang.Thread.run(Thread.java:748)
January 24th 2018, 15:32:46.451 eclipsehono-hono-adapter-http-vertx at io.vertx.core.http.impl.ServerConnection.handleMessage(ServerConnection.java:156)
It might recover from that after 10 minutes exactly. Probably the next time the link gets closed.
I have a hunch in which direction this might go.
The HonoClientImpl
is dependent on the information of isConnected()
, which it implements by using isDisconnected()
from ProtonConnection
. Which implements this in ProtonConnectionImpl
as:
public boolean isDisconnected() {
return transport == null;
}
However a call to ProtonConnection#disconnect()
does:
public void disconnect() {
if (transport != null) {
transport.disconnect();
}
}
So calling disconnect()
and then isDisconnected()
should still return false
, as the transport still is non-null.
I will try to test this tomorrow.
After making the changes and re-running them in the cluster. The problem appears to be gone and my theory correct. I will continue monitoring this in my test. But I am confident enough to open a PR -> #464
However I thing a few additional pairs of eyes would be necessary here. I am also not quite sure if this really is problem of Hono, or more in vertx-proton. However the PR fixes the issue for Hono.
I just saw the Travis CI tests fail now. I will look into this, however I cannot reproduce that locally yet.
Ok, I think I did understand that problem. The order updating the latch and firing the connect event was in the wrong order. If the latch is updated first, then the unit test will continue while the mock connection factory still is handling the event (updating the connection in the client).
I updated the PR to take care of this.
Letting this run for a while longer, this really looks good. It's all green now ... green is good 😀
Travis is green as well after the amended PR.
So calling disconnect() and then isDisconnected() should still return false, as the transport still is non-null.
This looks like a bug in ProtonConnectionImpl
. doesn't it? Shouldn't the disconnect
method also set transport to null
like fireDisconnect
does?
I agree -> https://github.com/vert-x3/vertx-proton/issues/64
To be fair, I don't know what the actual solution should look like. But I would expect that isDisconnected()
returns true
after the call to disconnect()
.
@sophokles73 I agree. Just commented on that issue
@ctron can we close this issue?
I think so. From a Hono perspective we have a solution.
When testing Hono I ran into the following scenario, the message flow works fine until the internal token expires. This will cause the connection to the adapter be terminated. The service should re-connect automatically but this doesn't seem to work. Instead I do see the following messages in the log file:
The location which seems to map is in
org.eclipse.hono.adapter.http.AbstractVertxBasedHttpProtocolAdapter.doUploadMessage(RoutingContext, String, String, Buffer, String, Future<MessageSender>, String)
on line 494. A message ofnull
most often indicates a null pointer exception.