I think this issue is more related to hypertrace-service. These are all startup exceptions from hypertrace-service, due to the way it is composed. The startup error logs are:
(1) exception while getting a view from Zookeeper
hypertrace | 2020-10-15 14:29:30.522 [main] WARN o.a.p.c.ExternalViewReader - Exception while reading External view from zookeeper
hypertrace | java.lang.NullPointerException: null
hypertrace | at org.apache.pinot.client.ExternalViewReader.unpackZnodeIfNecessary(ExternalViewReader.java:126) ~[pinot-java-client-0.5.0.jar:0.5.0-d87bbc9032c6efe626eb5f9ef1db4de7aa067179]
hypertrace | at org.apache.pinot.client.ExternalViewReader.getTableToBrokersMap(ExternalViewReader.java:91) ~[pinot-java-client-0.5.0.jar:0.5.0-d87bbc9032c6efe626eb5f9ef1db4de7aa067179]
hypertrace | at org.apache.pinot.client.DynamicBrokerSelector.refresh(DynamicBrokerSelector.java:57) ~[pinot-java-client-0.5.0.jar:0.5.0-d87bbc9032c6efe626eb5f9ef1db4de7aa067179]
hypertrace | at org.apache.pinot.client.DynamicBrokerSelector.<init>(DynamicBrokerSelector.java:53) ~[pinot-java-client-0.5.0.jar:0.5.0-d87bbc9032c6efe626eb5f9ef1db4de7aa067179]
hypertrace | at org.apache.pinot.client.ConnectionFactory.fromZookeeper(ConnectionFactory.java:44) ~[pinot-java-client-0.5.0.jar:0.5.0-d87bbc9032c6efe626eb5f9ef1db4de7aa067179]
hypertrace | at org.hypertrace.core.query.service.pinot.PinotClientFactory$PinotClient.<init>(PinotClientFactory.java:71) ~[query-service-impl.jar:?]
hypertrace | at org.hypertrace.core.query.service.pinot.PinotClientFactory.createPinotClient(PinotClientFactory.java:32) ~[query-service-impl.jar:?]
hypertrace | at org.hypertrace.core.query.service.pinot.PinotRequestHandlerBuilder.build(PinotRequestHandlerBuilder.java:35) ~[query-service-impl.jar:?]
hypertrace | at org.hypertrace.core.query.service.RequestHandlerRegistry.lambda$buildFromMatchingHandler$2(RequestHandlerRegistry.java:35) ~[query-service-impl.jar:?]
(2) span request exception (which causes the above Pinot exception)
hypertrace | 2020-10-14 11:07:16.755 [grpc-default-executor-0] ERROR o.h.g.s.GatewayServiceImpl - Error while handling spans request: start_time_millis: 1602673625897
hypertrace | end_time_millis: 1602673635897
hypertrace | selection {
hypertrace |   columnIdentifier {
hypertrace |     columnName: "EVENT.id"
hypertrace |   }
hypertrace | }
hypertrace | limit: 1
hypertrace |
hypertrace | io.grpc.StatusRuntimeException: INTERNAL: org.apache.pinot.client.PinotClientException: Could not find broker to query for table: null
hypertrace | java.lang.RuntimeException: org.apache.pinot.client.PinotClientException: Could not find broker to query for table: null
hypertrace |   at org.hypertrace.core.query.service.pinot.PinotBasedRequestHandler.handleRequest(PinotBasedRequestHandler.java:351)
hypertrace |   at org.hypertrace.core.query.service.QueryServiceImpl.lambda$executeTransformedRequest$2(QueryServiceImpl.java:61)
hypertrace |   at io.reactivex.rxjava3.internal.operators.mixed.MaybeFlatMapObservable$FlatMapObserver.onSuccess(MaybeFlatMapObservable.java:102)
Exception (2) is logged at multiple layers as well: first at the query-service layer, and then at the gateway-service layer.
The main reason behind these error logs is that we start hypertrace-service and Pinot in parallel. The internal query-service depends on Pinot and tries to connect to it during startup via pinot-client. This connection is Zookeeper based and spits out a set of warning messages, as shown in exception (1) - ref: https://github.com/apache/incubator-pinot/blob/master/pinot-clients/pinot-java-client/src/main/java/org/apache/pinot/client/ExternalViewReader.java - because a broker is not yet up and registered with Zookeeper.
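For context, the connection the query-service makes is roughly the following (a simplified sketch, not the actual PinotClientFactory code; the Zookeeper connect string is an assumption). ConnectionFactory.fromZookeeper reads the external view from Zookeeper to discover brokers, which is where the NPE in exception (1) surfaces when no broker has registered yet:

```java
import org.apache.pinot.client.Connection;
import org.apache.pinot.client.ConnectionFactory;

public class PinotStartupSketch {
  public static Connection connect() {
    // "zookeeper:2181/pinot" is an assumed ZK connect string for this deployment.
    // This call walks the external view in Zookeeper to find brokers; with no
    // broker registered yet, ExternalViewReader logs the warning in exception (1).
    return ConnectionFactory.fromZookeeper("zookeeper:2181/pinot");
  }
}
```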
Once Pinot is up, hypertrace-service immediately tests with a span request. Though a broker is up, it has not yet registered with the Zookeeper paths, so we get exception (2).
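One way to sidestep exception (2) would be to retry the first query until a broker resolves. This is only an illustrative sketch; the table name and retry budget are assumptions, and it is not necessarily what the linked PR does:

```java
import org.apache.pinot.client.Connection;
import org.apache.pinot.client.PinotClientException;
import org.apache.pinot.client.ResultSetGroup;

public class BrokerReadinessSketch {
  // Retry the first query until a broker has registered with Zookeeper.
  static ResultSetGroup queryWhenReady(Connection connection) throws InterruptedException {
    for (int attempt = 0; attempt < 30; attempt++) {
      try {
        // "spanEventView" is an assumed table name for illustration only.
        return connection.execute("SELECT COUNT(*) FROM spanEventView");
      } catch (PinotClientException e) {
        Thread.sleep(1000); // broker not registered yet; wait and retry
      }
    }
    throw new IllegalStateException("Pinot broker did not become available in time");
  }
}
```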
As of now, at the individual gRPC service layer, the exception response is wrapped via responseObserver.onError (https://github.com/hypertrace/gateway-service/blob/main/gateway-service-impl/src/main/java/org/hypertrace/gateway/service/GatewayServiceImpl.java) and logged too. Here, we have an opportunity to improve with status codes, etc. What other suggestions are there?
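For example, one possible improvement would be to map transient backend failures to a more specific gRPC status than a bare INTERNAL. This is only a sketch; the helper name and the UNAVAILABLE mapping are suggestions, not current GatewayServiceImpl behavior:

```java
import io.grpc.Status;
import io.grpc.stub.StreamObserver;

final class ErrorMapping {
  static <T> void fail(StreamObserver<T> responseObserver, Throwable t) {
    Status status = isBackendUnavailable(t)
        ? Status.UNAVAILABLE.withDescription("Pinot broker not available yet")
        : Status.INTERNAL.withDescription(String.valueOf(t.getMessage()));
    responseObserver.onError(status.withCause(t).asRuntimeException());
  }

  private static boolean isBackendUnavailable(Throwable t) {
    // Heuristic: treat "Could not find broker" as a transient availability problem.
    return t.getMessage() != null && t.getMessage().contains("Could not find broker");
  }
}
```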
For now, I think we can avoid the startup-time error logs that occur before the full stack is up, to reduce confusion. We can address that in https://github.com/hypertrace/hypertrace-service/pull/39.
This PR - https://github.com/hypertrace/hypertrace-service/pull/39 - fixes this issue.
Verified the fix - https://github.com/hypertrace/hypertrace-service/pull/39 - and I don't see the above exceptions on startup. Closing this.
When there's no data, it should be handled gracefully or return some sort of error code. Thanks @kishoreg for noticing.
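A rough illustration of the two options (hypothetical helpers, not existing hypertrace code):

```java
import io.grpc.Status;
import io.grpc.stub.StreamObserver;

final class NoDataSketch {
  // Option 1: succeed with an empty payload instead of an error.
  static <T> void completeEmpty(StreamObserver<T> observer, T emptyResponse) {
    observer.onNext(emptyResponse);
    observer.onCompleted();
  }

  // Option 2: surface an explicit, non-INTERNAL error code.
  static <T> void failNotFound(StreamObserver<T> observer) {
    observer.onError(Status.NOT_FOUND
        .withDescription("No data for the requested time range")
        .asRuntimeException());
  }
}
```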