awslabs / aws-glue-libs

AWS Glue Libraries are additions and enhancements to Spark for ETL operations.

Log4j security vulnerability #109

Open vishalkc opened 2 years ago

vishalkc commented 2 years ago

I checked the log4j version used in the Apache Spark distribution (e.g. https://aws-glue-etl-artifacts.s3.amazonaws.com/glue-3.0/spark-3.1.1-amzn-0-bin-3.2.1-amzn-3.tgz), and it appears to reference the jar files log4j-2.4.1.jar, log4j-core-2.4.1.jar 1.2.17 and log4j-api-2.4.1.jar 1.2.17. Given CVE-2021-44228, is there a plan to upgrade the Spark distribution to log4j version 2.15.0?

vfrank66 commented 2 years ago

Same here. Our IT security team is all over us about this being present on local machines.

MuggleHerder commented 2 years ago

The target version should be 2.17.1 rather than the 2.15.0 mentioned in the original comment.

moomindani commented 2 years ago

Hi, thanks for reporting this issue.

We have already upgraded all Glue dependencies and mitigated the issue identified in the CVE. Before the fix, an older log4j2 version might have been pulled in through the pom.xml file. If you still see older log4j2 versions, please try cleaning up the local cache with mvn clean.

reference:

vfrank66 commented 2 years ago

I am using the glue-2.0 tag, and mvn clean was not the solution: the older version was still pulled in. I had to update the pom.xml file. Not sure why this is closed. Was only the latest Glue version patched?

  <dependencies>
    <dependency>
      <groupId>com.amazonaws</groupId>
      <artifactId>AWSGlueETL</artifactId>
      <version>${project.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-core</artifactId>
      <version>2.17.1</version>
    </dependency>
  </dependencies>

moomindani commented 2 years ago

Thanks for trying it. Could you also try cleaning up the jarsv1 directory?

I checked out the glue-2.0 branch, ran rm -rf jarsv1/, ran ./bin/gluepyspark to trigger Maven, and confirmed that the jarsv1 directory contains only log4j 2.17.1 (the latest version) and 1.2.17 (which comes from Spark's dependencies and is not impacted by the CVE).
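For anyone who wants to double-check the result of that cleanup, here is a minimal sketch that lists the log4j jars in a directory and flags anything below 2.17.1. The helper names (`classify`, `scan_dir`) and the status labels are made up for illustration, and the check relies only on file names, so it will not catch shaded or renamed jars.

```python
import re
from pathlib import Path

# Matches names such as log4j-core-2.17.1.jar or log4j-1.2.17.jar.
# Only jars whose file name starts with "log4j" are considered.
JAR_RE = re.compile(r"^(log4j[\w.-]*?)-(\d+(?:\.\d+)*)\.jar$")


def classify(jar_name, target=(2, 17, 1)):
    """Return (artifact, version, status) for a log4j jar name, else None."""
    match = JAR_RE.match(jar_name)
    if match is None:
        return None
    artifact, version_text = match.group(1), match.group(2)
    version = tuple(int(part) for part in version_text.split("."))
    if version[0] == 1:
        status = "log4j-1.x"      # pulled in by Spark, end of life
    elif version < target:
        status = "below-target"   # older than 2.17.1
    else:
        status = "ok"
    return artifact, version_text, status


def scan_dir(directory):
    """Classify every log4j jar found directly inside `directory`."""
    results = []
    for path in sorted(Path(directory).glob("*.jar")):
        entry = classify(path.name)
        if entry is not None:
            results.append(entry)
    return results


if __name__ == "__main__":
    # Assumes the script is run from the repository root after gluepyspark.
    for artifact, version, status in scan_dir("jarsv1"):
        print(f"{artifact:20s} {version:10s} {status}")
```

If any line reports below-target after removing jarsv1 and rerunning ./bin/gluepyspark, the older version is probably still coming from the pom.xml or the local Maven cache.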

MuggleHerder commented 2 years ago

This worked for me, thanks @moomindani.

MuggleHerder commented 2 years ago

Is there a plan to update 1.2.X? Per CVE-2021-4104, organizations should upgrade to a current version of Log4j 2. Log4j 1.x is end of life (as of 2015) and contains additional security vulnerabilities.

Per Red Hat guidance, if an upgrade to a current version of Log4j 2.x is not possible, these vulnerabilities can be mitigated by removing the JMSAppender class file from the classpath (if it isn't used).

Workaround: zip -q -d log4j-*.jar org/apache/log4j/net/JMSAppender.class
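Where the zip command is not available, the same idea can be sketched in Python. This is only an illustration of the workaround above, not an official tool: the function name `strip_entry` is hypothetical, and because the standard zipfile module cannot delete an entry in place, the sketch writes a new jar without the class and leaves the original untouched.

```python
import zipfile

# Class file named in the workaround above.
JMS_APPENDER = "org/apache/log4j/net/JMSAppender.class"


def strip_entry(src_jar, dst_jar, entry=JMS_APPENDER):
    """Copy src_jar to dst_jar without `entry`.

    Returns True if the entry was present in src_jar (and therefore
    dropped), False if there was nothing to remove.
    """
    removed = False
    with zipfile.ZipFile(src_jar) as src, \
            zipfile.ZipFile(dst_jar, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename == entry:
                removed = True
                continue  # skip the JMSAppender class
            # Reuse the original ZipInfo so names and timestamps carry over.
            dst.writestr(info, src.read(info.filename))
    return removed
```

A call such as strip_entry("log4j-1.2.17.jar", "log4j-1.2.17-nojms.jar") would produce a patched copy. The original jar still has to be replaced on the classpath by hand, and signed jars may fail verification after being rewritten this way.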

moomindani commented 2 years ago

Log4j 1.2.17 comes from Spark's dependencies, not from this library. The Spark community migrated from log4j 1.2.17 to log4j 2.x in this JIRA: https://issues.apache.org/jira/browse/SPARK-6305. Spark 3.3.0 will be the first release to ship with log4j 2.x.

Currently, the latest version of Spark supported on the Glue platform is Spark 3.1.1. Once Spark 3.3.0 or later is supported on the Glue platform, Glue will pick up the log4j 2.x upgrade as well.