apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0

pinot-clients have too many dependencies #11507

Open cyrilou242 opened 10 months ago

cyrilou242 commented 10 months ago

Problem

Since 0.11.0, pinot-jdbc-client and pinot-java-client have a dependency on pinot-common. See the commits that introduced it: https://github.com/apache/pinot/commit/15e939818ca913bd9a5f0162300057c1a8b20f39 and https://github.com/apache/pinot/commit/23a81d07b52dad6181b696562e07cdfa0932d191

This introduces the following new dependencies to the clients:

\- org.apache.pinot:pinot-common:jar:0.12.1:compile
   +- org.apache.httpcomponents:httpmime:jar:4.5.13:compile
   +- org.antlr:antlr4-runtime:jar:4.6:compile
   +- org.apache.thrift:libthrift:jar:0.15.0:compile
   |  \- javax.annotation:javax.annotation-api:jar:1.3.2:compile
   +- org.xerial.snappy:snappy-java:jar:1.1.8.2:compile
   +- com.github.luben:zstd-jni:jar:1.5.2-3:compile
   +- org.lz4:lz4-java:jar:1.8.0:compile
   +- org.apache.logging.log4j:log4j-slf4j-impl:jar:2.17.1:compile
   |  \- org.apache.logging.log4j:log4j-core:jar:2.17.1:runtime
   +- commons-httpclient:commons-httpclient:jar:3.1:compile
   +- it.unimi.dsi:fastutil:jar:8.2.3:compile
   +- org.webjars:swagger-ui:jar:3.23.11:compile
   +- io.grpc:grpc-netty-shaded:jar:1.41.0:compile
   |  +- io.perfmark:perfmark-api:jar:0.23.0:runtime
   |  \- io.grpc:grpc-core:jar:1.41.0:compile
   |     +- com.google.code.gson:gson:jar:2.8.6:runtime
   |     +- com.google.android:annotations:jar:4.1.1.4:runtime
   |     \- org.codehaus.mojo:animal-sniffer-annotations:jar:1.19:runtime
   +- io.grpc:grpc-protobuf:jar:1.41.0:compile
   |  +- io.grpc:grpc-api:jar:1.41.0:compile
   |  |  \- io.grpc:grpc-context:jar:1.41.0:compile
   |  +- com.google.api.grpc:proto-google-common-protos:jar:2.0.1:compile
   |  \- io.grpc:grpc-protobuf-lite:jar:1.41.0:compile
   +- io.grpc:grpc-stub:jar:1.41.0:compile
   +- org.apache.yetus:audience-annotations:jar:0.13.0:compile
   +- org.mindrot:jbcrypt:jar:0.4:compile
   \- com.github.seancfoley:ipaddress:jar:5.3.4:compile

This is a lot of new, unused dependencies for the client. Also notice that a log4j implementation is included (org.apache.logging.log4j:log4j-slf4j-impl:jar:2.17.1:compile). Shipping a logging implementation in a client library is bad practice, but it is easy to work around with an exclusion in the pom.
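
A minimal sketch of such an exclusion, for illustration only. It assumes the consuming project depends on pinot-java-client 0.12.1 (the version shown in the tree above) and supplies its own SLF4J binding; adjust the artifact and version to match your project.

```xml
<!-- Sketch only: the artifact and version are assumptions based on the tree above. -->
<dependency>
  <groupId>org.apache.pinot</groupId>
  <artifactId>pinot-java-client</artifactId>
  <version>0.12.1</version>
  <exclusions>
    <!-- Drop the log4j SLF4J binding pulled in transitively through pinot-common -->
    <exclusion>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-slf4j-impl</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

In the tree above, log4j-core appears only under log4j-slf4j-impl, so this exclusion should remove it as well, unless another dependency in your project brings it back. Running mvn dependency:tree after the change confirms the result.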

Then for pinot-jdbc-client a dependency on pinot-core was introduced: https://github.com/apache/pinot/commit/caf8d755820d6bf27bc7daeb74ac35d2e70caa61

This introduces the following dependencies in the client:

+- org.apache.pinot:pinot-jdbc-client:jar:0.12.1:compile
|  +- org.apache.pinot:pinot-core:jar:0.12.1:compile
|  |  +- com.uber:h3:jar:4.0.0:compile
|  |  +- org.roaringbitmap:RoaringBitmap:jar:0.9.35:compile
|  |  |  \- org.roaringbitmap:shims:jar:0.9.35:runtime
|  |  +- org.apache.pinot:pinot-segment-spi:jar:0.12.1:compile
|  |  +- org.apache.pinot:pinot-segment-local:jar:0.12.1:compile
|  |  +- io.netty:netty-transport-native-epoll:jar:linux-x86_64:4.1.79.Final:compile
|  |  |  +- io.netty:netty-common:jar:4.1.79.Final:compile
|  |  |  +- io.netty:netty-buffer:jar:4.1.79.Final:compile
|  |  |  +- io.netty:netty-transport:jar:4.1.79.Final:compile
|  |  |  +- io.netty:netty-transport-native-unix-common:jar:4.1.79.Final:compile
|  |  |  \- io.netty:netty-transport-classes-epoll:jar:4.1.79.Final:compile
|  |  +- io.netty:netty-transport-native-kqueue:jar:osx-x86_64:4.1.79.Final:compile
|  |  |  \- io.netty:netty-transport-classes-kqueue:jar:4.1.79.Final:compile
|  |  +- io.netty:netty-tcnative-boringssl-static:jar:linux-x86_64:2.0.53.Final:compile
|  |  |  +- io.netty:netty-tcnative-classes:jar:2.0.53.Final:compile
|  |  |  +- io.netty:netty-tcnative-boringssl-static:jar:linux-aarch_64:2.0.53.Final:compile
|  |  |  +- io.netty:netty-tcnative-boringssl-static:jar:osx-aarch_64:2.0.53.Final:compile
|  |  |  \- io.netty:netty-tcnative-boringssl-static:jar:windows-x86_64:2.0.53.Final:compile
|  |  +- io.netty:netty-tcnative-boringssl-static:jar:osx-x86_64:2.0.53.Final:compile
|  |  +- io.netty:netty-all:jar:4.1.79.Final:compile
|  |  |  +- io.netty:netty-codec:jar:4.1.79.Final:compile
|  |  |  +- io.netty:netty-codec-dns:jar:4.1.79.Final:compile
|  |  |  +- io.netty:netty-codec-haproxy:jar:4.1.79.Final:compile
|  |  |  +- io.netty:netty-codec-http2:jar:4.1.79.Final:compile
|  |  |  +- io.netty:netty-codec-memcache:jar:4.1.79.Final:compile
|  |  |  +- io.netty:netty-codec-mqtt:jar:4.1.79.Final:compile
|  |  |  +- io.netty:netty-codec-redis:jar:4.1.79.Final:compile
|  |  |  +- io.netty:netty-codec-smtp:jar:4.1.79.Final:compile
|  |  |  +- io.netty:netty-codec-stomp:jar:4.1.79.Final:compile
|  |  |  +- io.netty:netty-codec-xml:jar:4.1.79.Final:compile
|  |  |  +- io.netty:netty-resolver:jar:4.1.79.Final:compile
|  |  |  +- io.netty:netty-resolver-dns:jar:4.1.79.Final:compile
|  |  |  +- io.netty:netty-transport-rxtx:jar:4.1.79.Final:compile
|  |  |  +- io.netty:netty-transport-sctp:jar:4.1.79.Final:compile
|  |  |  +- io.netty:netty-transport-udt:jar:4.1.79.Final:compile
|  |  |  +- io.netty:netty-resolver-dns-classes-macos:jar:4.1.79.Final:compile
|  |  |  +- io.netty:netty-transport-native-epoll:jar:linux-aarch_64:4.1.79.Final:runtime
|  |  |  +- io.netty:netty-transport-native-kqueue:jar:osx-aarch_64:4.1.79.Final:runtime
|  |  |  +- io.netty:netty-resolver-dns-native-macos:jar:osx-x86_64:4.1.79.Final:runtime
|  |  |  \- io.netty:netty-resolver-dns-native-macos:jar:osx-aarch_64:4.1.79.Final:runtime
|  |  +- com.clearspring.analytics:stream:jar:2.7.0:compile
|  |  +- org.apache.datasketches:datasketches-java:jar:1.2.0-incubating:compile
|  |  |  \- org.apache.datasketches:datasketches-memory:jar:1.2.0-incubating:compile
|  |  +- com.tdunning:t-digest:jar:3.2:compile
|  |  +- org.xerial.larray:larray-mmap:jar:0.4.1:compile
|  |  |  \- org.xerial.larray:larray-buffer:jar:0.4.1:compile
|  |  +- net.sf.jopt-simple:jopt-simple:jar:4.6:compile
|  |  +- org.glassfish.jersey.containers:jersey-container-grizzly2-http:jar:2.35:compile
|  |  |  \- org.glassfish.hk2.external:jakarta.inject:jar:2.6.1:compile
|  |  +- org.glassfish.grizzly:grizzly-http-server:jar:2.4.4:compile
|  |  +- org.glassfish.hk2:hk2-locator:jar:2.6.1:compile
|  |  +- org.apache.lucene:lucene-core:jar:8.2.0:compile
|  |  +- org.apache.lucene:lucene-queryparser:jar:8.2.0:compile
|  |  |  +- org.apache.lucene:lucene-queries:jar:8.2.0:compile
|  |  |  \- org.apache.lucene:lucene-sandbox:jar:8.2.0:compile
|  |  \- org.apache.lucene:lucene-analyzers-common:jar:8.2.0:compile
|  +- org.asynchttpclient:async-http-client:jar:2.12.3:compile
|  |  +- org.asynchttpclient:async-http-client-netty-utils:jar:2.12.3:compile
|  |  +- io.netty:netty-codec-http:jar:4.1.60.Final:compile
|  |  +- io.netty:netty-handler:jar:4.1.60.Final:compile
|  |  +- io.netty:netty-codec-socks:jar:4.1.60.Final:compile
|  |  +- io.netty:netty-handler-proxy:jar:4.1.60.Final:compile
|  |  +- org.reactivestreams:reactive-streams:jar:1.0.3:compile
|  |  +- com.typesafe.netty:netty-reactive-streams:jar:2.0.4:compile
|  |  \- com.sun.activation:jakarta.activation:jar:1.2.2:compile
|  \- com.101tec:zkclient:jar:0.7:compile

Impact: The Pinot clients are too heavy and introduce too many unused dependencies that can conflict with users' projects. We are stuck on client 0.10.0.

Suggestion

From what I understand of the commits, the goal of importing pinot-common and pinot-core was to get access to a few utils, and almost none of the dependencies are actually used. It'd be nice to refactor the dependencies to make the Pinot clients light again.

abhioncbr commented 10 months ago

I can look into making it slim. Thanks.

xiangfu0 commented 10 months ago

Another approach is to remove the dependency on pinot-common and depend only on pinot-spi. This might take longer to finish: we need to extract the SqlParser out of pinot-common.

xiangfu0 commented 9 months ago

@cyrilou242 I think for now you may need to explicitly exclude the unused libs from pinot-jdbc-client.
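
A sketch of what that workaround could look like in a consumer's pom, assuming pinot-jdbc-client 0.12.1 and assuming the JDBC code paths your application uses never load classes from pinot-core. That second assumption is not confirmed anywhere in this thread, so test before relying on it.

```xml
<!-- Sketch only: whether the driver works without pinot-core is an untested assumption. -->
<dependency>
  <groupId>org.apache.pinot</groupId>
  <artifactId>pinot-jdbc-client</artifactId>
  <version>0.12.1</version>
  <exclusions>
    <!-- Excluding pinot-core also removes everything it pulls in (netty-all, lucene, grizzly, ...) -->
    <exclusion>
      <groupId>org.apache.pinot</groupId>
      <artifactId>pinot-core</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

If a class from an excluded artifact turns out to be needed, the failure shows up at runtime as a NoClassDefFoundError rather than at build time, so exercise the driver against a real cluster after changing the exclusions.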

abhioncbr commented 9 months ago

> Another approach is to remove the dependency on pinot-common and depend only on pinot-spi. This might take longer to finish: we need to extract the SqlParser out of pinot-common.

I can start looking into this, if it's the approach we want to take.

xiangfu0 commented 9 months ago

An improvement that removes the pinot-core dependency from pinot-jdbc-client: https://github.com/apache/pinot/pull/11620