kdgregory / log4j-aws-appenders

Appenders for Log4J 1.2.x, Log4J 2.x, and Logback that write to AWS destinations.
Apache License 2.0

Unable to use CloudWatch Appender from within AWS EMR Cluster #163

Closed by JustAnotherContributor 1 year ago

JustAnotherContributor commented 1 year ago

When I run my Java app locally, the CloudWatch appender works fine, but when I upload my JAR to an EMR cluster, the Spark step is unable to find or load the class. I have the following packages specified as dependencies in my app's POM, and I have also tried specifying them with the --packages option to the spark-submit command. In the latter case I can see the Spark step downloading the packages and their dependencies and uploading them to the driver and executors, so they should be available on the classpath. Any assistance would be appreciated; I suspect I'm overlooking something, most likely something simple.

java.lang.ClassNotFoundException: com.kdgregory.log4j.aws.CloudWatchAppender
    at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:264)
    at org.apache.log4j.helpers.Loader.loadClass(Loader.java:198)
    at org.apache.log4j.xml.DOMConfigurator.parseAppender(DOMConfigurator.java:247)
    at org.apache.log4j.xml.DOMConfigurator.findAppenderByName(DOMConfigurator.java:176)
    at org.apache.log4j.xml.DOMConfigurator.findAppenderByReference(DOMConfigurator.java:191)
    at org.apache.log4j.xml.DOMConfigurator.parseAppender(DOMConfigurator.java:284)
    at org.apache.log4j.xml.DOMConfigurator.findAppenderByName(DOMConfigurator.java:176)
    at org.apache.log4j.xml.DOMConfigurator.findAppenderByReference(DOMConfigurator.java:191)
    at org.apache.log4j.xml.DOMConfigurator.parseChildrenOfLoggerElement(DOMConfigurator.java:523)
    at org.apache.log4j.xml.DOMConfigurator.parseRoot(DOMConfigurator.java:492)
    at org.apache.log4j.xml.DOMConfigurator.parse(DOMConfigurator.java:1006)
    at org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java:872)
    at org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java:778)
    at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
    at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
    at org.slf4j.impl.Log4jLoggerFactory.<init>(Log4jLoggerFactory.java:66)
    at org.slf4j.impl.StaticLoggerBinder.<init>(StaticLoggerBinder.java:72)
    at org.slf4j.impl.StaticLoggerBinder.<clinit>(StaticLoggerBinder.java:45)
    at org.apache.spark.internal.Logging$.org$apache$spark$internal$Logging$$isLog4j12(Logging.scala:222)
    at org.apache.spark.internal.Logging.initializeLogging(Logging.scala:127)
    at org.apache.spark.internal.Logging.initializeLogIfNecessary(Logging.scala:111)
    at org.apache.spark.internal.Logging.initializeLogIfNecessary$(Logging.scala:105)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.initializeLogIfNecessary(ApplicationMaster.scala:838)
    at org.apache.spark.internal.Logging.initializeLogIfNecessary(Logging.scala:102)
    at org.apache.spark.internal.Logging.initializeLogIfNecessary$(Logging.scala:101)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.initializeLogIfNecessary(ApplicationMaster.scala:838)
    at org.apache.spark.internal.Logging.log(Logging.scala:49)
    at org.apache.spark.internal.Logging.log$(Logging.scala:47)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.log(ApplicationMaster.scala:838)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:853)
    at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
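
The trace shows Log4J 1.x resolving the appender class through sun.misc.Launcher$AppClassLoader while the Spark ApplicationMaster is still initializing its logging. A minimal diagnostic sketch follows, assuming (this is not confirmed anywhere in the thread) that the appender JAR is visible to a different class loader than the one Log4J uses at that point. The names `ClassVisibilityCheck` and `isVisible` are hypothetical and not part of Spark, Log4J, or the appenders library.

```java
public class ClassVisibilityCheck {

    /**
     * Hypothetical helper: returns true if the given loader can find the class.
     * The class is looked up without being initialized.
     */
    static boolean isVisible(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0
                ? args[0]
                : "com.kdgregory.log4j.aws.CloudWatchAppender";

        // The system class loader is the one that appears in the stack trace.
        ClassLoader system = ClassLoader.getSystemClassLoader();
        // The context class loader may additionally see JARs shipped with the application.
        ClassLoader context = Thread.currentThread().getContextClassLoader();

        System.out.println("system class loader:  " + isVisible(name, system));
        System.out.println("context class loader: " + isVisible(name, context));
    }
}
```

If, when run from the step's own code, the first check reports false and the second true, the class is packaged with the application but is not on the JVM's own classpath, which is where Log4J looks when it configures itself at startup.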

JustAnotherContributor commented 1 year ago

As I'm using EMR 6.4.0, I also tried the following dependencies, but that didn't help:

kdgregory commented 1 year ago

@JustAnotherContributor - sorry, I can't provide much help with this, as it's an issue with application packaging, and not the library itself.

I will note that the Spark documentation says to package any dependencies in an "uber" JAR (https://spark.apache.org/docs/latest/submitting-applications.html), so I would start there.
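
Before looking further, it may be worth confirming that the "uber" JAR really contains the appender class. The sketch below is one way to do that with the standard java.util.jar API; `JarClassCheck`, `entryName`, and `containsClass` are hypothetical names, and the JAR path passed on the command line is whatever your build produces.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.JarFile;

public class JarClassCheck {

    /** Converts a fully qualified class name to its entry path inside a JAR. */
    static String entryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Hypothetical helper: returns true if the JAR at jarPath has an entry for className. */
    static boolean containsClass(String jarPath, String className) {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getJarEntry(entryName(className)) != null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // args[0]: path to the uber JAR produced by your build
        String className = "com.kdgregory.log4j.aws.CloudWatchAppender";
        System.out.println(className + " packaged: " + containsClass(args[0], className));
    }
}
```

A result of true only shows that the class is in the JAR; it does not show which class loader can see it at the time Log4J is configured.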

JustAnotherContributor commented 1 year ago

I tried many different ways of getting this to work. I was already using an "uber" JAR, with the appenders JAR and its dependencies specified in my app's POM. While not ideal, the only working solution I could find was to add the appender programmatically within the EMR step's Java code:

import com.kdgregory.log4j.aws.CloudWatchAppender;
import org.apache.log4j.PatternLayout;
import org.apache.log4j.Logger;
...
// Configure the appender in code instead of in log4j.xml
CloudWatchAppender cloudwatch = new CloudWatchAppender();
// Log group and stream names use the library's substitution syntax
cloudwatch.setLogGroup("{env:APP_NAME}-{sysprop:deployment:dev}");
cloudwatch.setLogStream("{hostname}-{startupTimestamp}");
cloudwatch.setDedicatedWriter(true);
cloudwatch.setLayout(new PatternLayout("%d [%t] %-5p - %c - %m%n"));
// Apply the settings above, then attach the appender to the root logger
cloudwatch.activateOptions();
Logger.getRootLogger().addAppender(cloudwatch);

JustAnotherContributor commented 1 year ago

For anyone who comes across this issue: I was able to get things working by adding a bootstrap script with the following to the EMR cluster:

echo "installing cloudwatch log4jv1 appender"
sudo mkdir -p /usr/lib/spark/jars/
for artifact in "aws-facade-v1" "log4j1-aws-appenders" "logwriters"; do
    sudo wget "https://repo1.maven.org/maven2/com/kdgregory/logging/$artifact/3.0.1/$artifact-3.0.1.jar" \
        -O "/usr/lib/spark/jars/$artifact-3.0.1.jar"
done

Then the method I was using to provide the log4j.xml (or log4j.properties) file works:

--files,s3a://<bucket>/<folder>/log4j.xml,--conf,spark.driver.extraJavaOptions=-Dlog4j.configuration=file:log4j.xml,--conf,spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.xml

Note: the com.amazonaws:aws-java-sdk-logs artifact is not required, as it already exists on the EMR cluster nodes.
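
The step arguments above follow a fixed pattern: ship the configuration file with --files, then point both the driver and the executors at it by its bare file name, since distributed files are placed in each container's working directory. As a rough sketch, a step definition generated from Java could assemble the same string like this; `Log4jStepArgs` and `build` are hypothetical names, and the S3 URI is the placeholder from the comment above.

```java
import java.util.ArrayList;
import java.util.List;

public class Log4jStepArgs {

    /**
     * Hypothetical helper: builds the comma-separated argument fragment that
     * distributes a Log4J 1.x config file and tells driver and executors to use it.
     */
    static String build(String configS3Uri) {
        // Only the file name is used in the JVM option; the directory part is dropped.
        String fileName = configS3Uri.substring(configS3Uri.lastIndexOf('/') + 1);
        String javaOpt = "-Dlog4j.configuration=file:" + fileName;

        List<String> args = new ArrayList<>();
        args.add("--files");
        args.add(configS3Uri);
        args.add("--conf");
        args.add("spark.driver.extraJavaOptions=" + javaOpt);
        args.add("--conf");
        args.add("spark.executor.extraJavaOptions=" + javaOpt);
        return String.join(",", args);
    }

    public static void main(String[] args) {
        // Placeholder URI; substitute your own bucket and folder.
        System.out.println(build("s3a://<bucket>/<folder>/log4j.xml"));
    }
}
```

Deriving the file name from the URI keeps the --files value and the two -Dlog4j.configuration options from drifting apart when the configuration file is renamed.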