timveil / hive-jdbc-uber-jar

Hive JDBC "uber" or "standalone" jar based on the latest Apache Hive version

Log4J Security Question #28

Closed: rbeauchemin closed this issue 2 years ago

rbeauchemin commented 2 years ago

Hey Tim,

I want to make sure that Log4j, if it is included here, is not one of the versions affected by CVE-2021-44228, but I have had trouble tracking down the dependencies in the final jar. I saw in a previous issue that you removed log4j, but the README seems to suggest that log4j is included. Could you comment?

rbeauchemin commented 2 years ago

I see it is in your exclusions in pom.xml, but I'm just double-checking that I'm reading that right and that it is excluded.

timveil commented 2 years ago

That's correct. One of my initial reasons for creating this project/jar was all of the problematic logging dependencies that were shaded (or not shaded) into the Hive standalone or related jars. Since the beginning I have worked to exclude any logging dependencies from the transitive list and have selectively shaded only the SLF4J API into this project. And as luck would have it, I've also used explicit <include> tags in the pom instead of <exclude>. This means anything NOT listed below is NOT included. As it relates to the log4j vulnerability, you can thus be assured it is not part of the final uber jar:

<includes>
    <include>com.google.guava:guava</include>
    <include>commons-codec:commons-codec</include>
    <include>commons-lang:commons-lang</include>
    <include>commons-configuration:commons-configuration</include>
    <include>commons-collections:commons-collections</include>
    <include>org.apache.curator:curator-client</include>
    <include>org.apache.curator:curator-framework</include>
    <include>org.apache.hadoop:hadoop-common</include>
    <include>org.apache.hadoop:hadoop-auth</include>
    <include>org.apache.hive:hive-classification</include>
    <include>org.apache.hive:hive-storage-api</include>
    <include>org.apache.hive:hive-common</include>
    <include>org.apache.hive:hive-jdbc</include>
    <include>org.apache.hive:hive-metastore</include>
    <include>org.apache.hive:hive-service</include>
    <include>org.apache.hive:hive-service-rpc</include>
    <include>org.apache.hive:hive-serde</include>
    <include>org.apache.hive:hive-shims</include>
    <include>org.apache.hive.shims:*</include>
    <include>org.apache.hive:hive-upgrade-acid</include>
    <include>org.apache.httpcomponents:httpclient</include>
    <include>org.apache.httpcomponents:httpcore</include>
    <include>org.apache.thrift:libthrift</include>
    <include>org.apache.zookeeper:zookeeper</include>
    <include>org.slf4j:slf4j-api</include>
    <include>org.slf4j:jcl-over-slf4j</include>
</includes>

rbeauchemin commented 2 years ago

Thanks Tim!

ni-shant commented 2 years ago

Hey @timveil, I just ran the Maven dependency command on this jar and could see that the log4j dependency is still being pulled in, so I wanted to double-check with you.

>> mvn dependency:tree | grep log4j
[INFO] |  |  +- org.apache.logging.log4j:log4j-1.2-api:jar:2.10.0:compile
[INFO] |  |  |  +- org.apache.logging.log4j:log4j-api:jar:2.10.0:compile
[INFO] |  |  |  \- org.apache.logging.log4j:log4j-core:jar:2.10.0:compile
[INFO] |  |  +- org.apache.logging.log4j:log4j-web:jar:2.10.0:compile
[INFO] |  |  +- org.apache.logging.log4j:log4j-slf4j-impl:jar:2.10.0:compile

timveil commented 2 years ago

@ni-shant, keep in mind what this project does: it uses Maven to pull the full transitive dependency graph and then selectively shades only certain jars into the "uber jar". Because I use <include> tags with the Maven Shade Plugin, only the specified dependencies are included in the final output. This means that while log4j does still appear as a transitive dependency of hive-jdbc (due to package renaming), it is not included in the final uber jar.
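
One way to confirm this independently of mvn dependency:tree is to inspect the built jar itself, since a jar is just a zip archive. The following is a minimal sketch and not part of this project: the function names filter_log4j_entries and list_log4j_in_jar are made up for illustration, it assumes Info-ZIP unzip is installed, and it only matches the default Log4j package paths (org/apache/log4j/ and org/apache/logging/log4j/), so relocated classes would not be detected.

```shell
#!/bin/sh
# Keep only entry names under the default Log4j 1.x (org/apache/log4j/) or
# Log4j 2.x (org/apache/logging/log4j/) package paths.
# Reads one entry name per line on stdin.
# Exit status follows grep: 0 if something matched, 1 if nothing did.
filter_log4j_entries() {
    grep -E '^org/apache/(logging/)?log4j/'
}

# List every entry in a jar and filter it.
# Assumption: Info-ZIP `unzip` is available (`unzip -Z1` prints bare entry names).
list_log4j_in_jar() {
    entries=$(unzip -Z1 "$1") || return 2   # 2 = could not read the archive
    printf '%s\n' "$entries" | filter_log4j_entries
}

# Run the check only when a jar path is given on the command line.
if [ -n "${1:-}" ]; then
    list_log4j_in_jar "$1"
    case $? in
        0) echo "Log4j classes found in $1" ;;
        1) echo "No Log4j classes found in $1" ;;
        *) echo "Could not read $1" >&2 ;;
    esac
fi
```

Saved as a script (for example, a hypothetical check-log4j.sh), it would be run as sh check-log4j.sh followed by the path to the uber jar you built or downloaded. Any matching entry names are printed before the summary line.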

Given the sensitivity of the above-mentioned CVE, I'm going to exclude the following dependencies to avoid any further confusion...

ni-shant commented 2 years ago

Thanks @timveil!