apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0

Multistage queries don't run locally #11076

Closed tibrewalpratik17 closed 1 year ago

tibrewalpratik17 commented 1 year ago

I am trying to run multistage queries locally, but they always end in this error:

java.io.IOException: Failed : HTTP error code : 500. Root Cause: <html><head><title>Grizzly 2.4.4</title><style><!--div.header {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#003300;font-size:22px;-moz-border-radius-topleft: 10px;border-top-left-radius: 10px;-moz-border-radius-topright: 10px;border-top-right-radius: 10px;padding-left: 5px}div.body {font-family:Tahoma,Arial,sans-serif;color:black;background-color:#FFFFCC;font-size:16px;padding-top:10px;padding-bottom:10px;padding-left:10px}div.footer {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#666633;font-size:14px;-moz-border-radius-bottomleft: 10px;border-bottom-left-radius: 10px;-moz-border-radius-bottomright: 10px;border-bottom-right-radius: 10px;padding-left: 5px}BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;}B {font-family:Tahoma,Arial,sans-serif;color:black;}A {color : black;}HR {color : #999966;}--></style> </head><body><div class="header">Request failed.</div><div class="body">Request failed.</div><div class="footer">Grizzly 2.4.4</div></body></html>

This happens when I run the quickstart script for colocated_join / multi_stage and then run this query: select * from userAttributes limit 10 option(useMultistageEngine=true)

The query works fine without the multistage query option, as that path uses HTTP calls between broker and server. I understand the failure happens during the gRPC call between broker and server, and the call never gets from the broker to the server. It fails exactly at this point in the code - https://github.com/apache/pinot/blob/9bb3cbb3d6ebf5a9fcd8a5410a5794aee02c7ee4/pinot-query-runtime/src/main/java/org/apache/pinot/query/service/dispatch/DispatchClient.java#L57

I have tried this on multiple laptops by cloning and building Pinot and running multistage queries, but every time it gets stuck here. Do we need to install anything additional to get it working?

Full stack trace - https://gist.github.com/tibrewalpratik17/41e49d1769b957ab5e4c8100d27647bb

Jackie-Jiang commented 1 year ago

@walterddr Can you help take a look?

walterddr commented 1 year ago

Yes, I see this in production as well. A recent dependency version upgrade somewhere introduced a gRPC API version that is, for some reason, packaged together with the distribution and is incompatible with our explicit dependency. So when gRPC channel creation occurs, it throws an error.

Can you share the EXACT command you used to build Pinot? I was unable to reproduce this in IntelliJ. If possible, also run a JFR recording when this throwable occurs to see the actual stack trace.

walterddr commented 1 year ago

Also note that we need to run multistage in the quickstart test, which runs on a packaged release, instead of mvn test (which doesn't run assembly).

tibrewalpratik17 commented 1 year ago

Can you share the EXACT command you used to build Pinot?

I used mvn clean install -DskipTests -Pbin-dist to build Pinot. My Maven version:

> mvn --version
Maven home: /opt/homebrew/Cellar/maven/3.9.3/libexec
Java version: 11.0.19, vendor: Eclipse Adoptium, runtime: /Library/Java/JavaVirtualMachines/temurin-11.jdk/Contents/Home
Default locale: en_IN, platform encoding: UTF-8
OS name: "mac os x", version: "13.4.1", arch: "aarch64", family: "mac"

Yes, I see this in production as well. A recent dependency version upgrade somewhere introduced a gRPC API version that is, for some reason, packaged together with the distribution and is incompatible with our explicit dependency. So when gRPC channel creation occurs, it throws an error.

Based on the stack trace it's happening at this point - https://github.com/grpc/grpc-java/blob/4fa2814d65b2536aede30e1f24c461a2f42be1f7/api/src/main/java/io/grpc/NameResolver.java#L415

It says java.lang.NoSuchMethodError: 'io.grpc.NameResolver$Args$Builder io.grpc.NameResolver$Args$Builder.setOverrideAuthority(java.lang.String)'
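
A NoSuchMethodError like this one usually means the code was compiled against one version of a class while a different version was resolved at runtime. The following is a minimal sketch, not part of Pinot, that uses reflection to ask whether the class actually resolved on the current classpath exposes a given public method. The class name `MethodPresenceCheck` and the helper `hasMethod` are hypothetical names chosen for illustration; the gRPC class and method names are taken from the error message above.

```java
final class MethodPresenceCheck {

    /**
     * Returns true if the class resolved on the current classpath exposes a
     * public method with the given name and parameter types.
     * (Hypothetical helper, for illustration only.)
     */
    static boolean hasMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            // initialize=false: only inspect the class, do not run static initializers
            Class<?> cls = Class.forName(className, false, MethodPresenceCheck.class.getClassLoader());
            cls.getMethod(methodName, paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            // Class missing, method missing, or class could not be linked
            return false;
        }
    }

    public static void main(String[] args) {
        // Binary name of the nested builder class, as printed in the NoSuchMethodError
        boolean present = hasMethod(
                "io.grpc.NameResolver$Args$Builder", "setOverrideAuthority", String.class);
        System.out.println("setOverrideAuthority present: " + present);
    }
}
```

The check is only meaningful when run with the same classpath as the failing process, for example the lib directory of the packaged distribution.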

Also note that we need to run multistage in the quickstart test, which runs on a packaged release, instead of mvn test (which doesn't run assembly).

I tried using both quickstart and mvn test, and got the same error.

xiangfu0 commented 1 year ago

https://github.com/apache/pinot/pull/11086

tibrewalpratik17 commented 1 year ago

@xiangfu0 this change might not be related to this issue, as the error seems to occur in the grpc-1.53.0 package. Also, I never pulled #11074 locally, as it's very recent.

xiangfu0 commented 1 year ago

Hmm, I explicitly built current master, then reverted #11074, and that fixed the problem...

xiangfu0 commented 1 year ago

Maybe try deleting the ~/.m2/repository/ cache and then running mvn clean install -Pbin-dist -DskipTests -T1C again?

tibrewalpratik17 commented 1 year ago

Hmm, I explicitly built current master, then reverted https://github.com/apache/pinot/pull/11074, and that fixed the problem... Maybe try deleting the ~/.m2/repository/ cache and then running mvn clean install -Pbin-dist -DskipTests -T1C again?

@xiangfu0 I did this, but it still doesn't work. I'm getting the same error.

I think the issue is that setOverrideAuthority is tagged as ExperimentalAPI in the grpc package and is not getting resolved at runtime. More details in this comment -- https://github.com/apache/pinot/issues/11076#issuecomment-1632033455

walterddr commented 1 year ago

I was able to confirm that it is fixed with the latest Pinot master. Which dependency did you find pulling in the experimental API? It seems like a GCP-specific dependency that happens to be in your execution classpath. I don't think Pinot packages that by default.

tibrewalpratik17 commented 1 year ago

I was able to root-cause the issue. It is caused by a conflicting NameResolver class in com.google.android:annotations:4.1.1.4 (annotations-4.1.1.4.jar). The attached screenshot shows that the ManagedChannelImpl class is using the NameResolver from annotations-4.1.1.4.jar, which doesn't have the setOverrideAuthority method, so the call ends in NoSuchMethodError.

Screenshot 2023-07-13 at 11 58 31 PM
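
The same kind of finding can be checked without a debugger. As a rough sketch, assuming it runs with the same classpath as the failing process, the snippet below prints the location (typically a jar path) that a class was loaded from. `ClassOrigin` and `locate` are hypothetical names used only for this example.

```java
import java.security.CodeSource;

final class ClassOrigin {

    /**
     * Returns the location a class was loaded from, or a short note when the
     * location is unknown or the class cannot be found.
     * (Hypothetical helper, for illustration only.)
     */
    static String locate(String className) {
        try {
            // initialize=false: only resolve the class, do not run static initializers
            Class<?> cls = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                // Typical for classes loaded by the bootstrap class loader
                return "(bootstrap or unknown location)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Class names taken from the discussion above
        System.out.println("NameResolver       <- " + locate("io.grpc.NameResolver"));
        System.out.println("ManagedChannelImpl <- " + locate("io.grpc.internal.ManagedChannelImpl"));
    }
}
```

If the two classes report different, unexpected jars, that points to a classpath conflict of the kind described in this comment.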

xiangfu0 commented 1 year ago

I was able to root-cause the issue. It is caused by a conflicting NameResolver class in com.google.android:annotations:4.1.1.4 (annotations-4.1.1.4.jar). The attached screenshot shows that the ManagedChannelImpl class is using the NameResolver from annotations-4.1.1.4.jar, which doesn't have the setOverrideAuthority method, so the call ends in NoSuchMethodError.

Thanks @tibrewalpratik17! Where did com.google.android:annotations:4.1.1.4 (annotations-4.1.1.4.jar) get introduced? Shall we exclude it?

walterddr commented 1 year ago

Yeah, I ran dependency:tree and it doesn't seem to be there. Maybe I ran it wrong?
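
When the dependency tree does not show the culprit, another option is to inspect the built artifacts directly. The sketch below walks a directory and lists every jar that contains a given class entry. `DuplicateClassFinder` and `jarsContaining` are hypothetical names, and the default entry `io/grpc/NameResolver.class` is only the class discussed in this thread. Note the limitation: if dependencies are merged into a single shaded jar, duplicates inside that one jar will not show up as separate hits.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipFile;

final class DuplicateClassFinder {

    /**
     * Lists every .jar file under root that contains the given entry,
     * e.g. "io/grpc/NameResolver.class". (Hypothetical helper.)
     */
    static List<Path> jarsContaining(Path root, String entryName) throws IOException {
        List<Path> jars;
        try (Stream<Path> walk = Files.walk(root)) {
            jars = walk
                    .filter(p -> Files.isRegularFile(p) && p.toString().endsWith(".jar"))
                    .sorted()
                    .collect(Collectors.toList());
        }
        List<Path> hits = new ArrayList<>();
        for (Path jar : jars) {
            // A jar is a zip archive, so ZipFile is enough to look up one entry
            try (ZipFile zip = new ZipFile(jar.toFile())) {
                if (zip.getEntry(entryName) != null) {
                    hits.add(jar);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java DuplicateClassFinder.java <directory> [entry name]
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        String entry = args.length > 1 ? args[1] : "io/grpc/NameResolver.class";
        jarsContaining(root, entry).forEach(System.out::println);
    }
}
```

More than one hit for the same entry means the class that wins at runtime depends on classpath order, which can differ between an IDE run and a packaged release.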

xiangfu0 commented 1 year ago

It comes from pinot-pulsar

[INFO] +- io.grpc:grpc-protobuf-lite:jar:1.19.0:compile
[INFO] |  +- io.grpc:grpc-core:jar:1.19.0:compile
[INFO] |  |  +- io.opencensus:opencensus-api:jar:0.19.2:compile
[INFO] |  |  \- io.opencensus:opencensus-contrib-grpc-metrics:jar:0.19.2:compile
[INFO] |  \- com.google.protobuf:protobuf-lite:jar:3.0.1:compile

walterddr commented 1 year ago

Hmm. @tibrewalpratik17 the Android annotations package was added so long ago (https://github.com/apache/pinot/commit/dafbef176f1f7517418b7b97a51b53b30455cfc8) that I don't think it would cause a problem now.

btw @xiangfu0 the link you added points back to this issue :-P

tibrewalpratik17 commented 1 year ago

Hmm. @tibrewalpratik17 the Android annotations package was added so long ago (https://github.com/apache/pinot/commit/dafbef176f1f7517418b7b97a51b53b30455cfc8) that I don't think it would cause a problem now.

Yeah, but somehow this is happening on a few of my colleagues' laptops as well.

xiangfu0 commented 1 year ago

Tried multiple things, but no luck: https://github.com/apache/pinot/pull/11106

Not sure why current grpc-protobuf-lite works

tibrewalpratik17 commented 1 year ago

I tried clean-building my repo against Maven Central and it worked. Previously I was using our company's internal Artifactory, where I think the grpc packages are not correct.

Thanks for the help @walterddr @xiangfu0!