aws-samples / aws-glue-samples

AWS Glue code samples
MIT No Attribution

[#115] resolves Spark-UI docker container startup issue in glue 3 #117

Closed SercanKaraoglu closed 2 years ago

SercanKaraoglu commented 2 years ago

Issue: #115

Description of changes: A mismatch between the AWS SDK dependency provided with Spark 3.1 and the one declared in the Maven build causes the reported issue. Stack trace:

Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:300)
    at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)
Caused by: java.lang.NoSuchFieldError: SERVICE_ID
    at com.amazonaws.services.s3.AmazonS3Client.createRequest(AmazonS3Client.java:4772)
    at com.amazonaws.services.s3.AmazonS3Client.createRequest(AmazonS3Client.java:4758)
    at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1434)
    at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:1374)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$verifyBucketExists$1(S3AFileSystem.java:381)
    at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
    at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:265)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:322)
    at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:261)
    at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:236)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:380)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:314)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3303)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3352)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3320)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:479)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
    at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:116)
    at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:88)

To check the provided dependency and the SDK versions, run the following:

docker run -it -v ${PWD}:/root/glue glue/sparkui:latest bash
cd /root/glue/
mvn dependency:tree -Dincludes=com.amazonaws
[INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ HadoopForGlueSparkHistoryServer ---
[INFO] com.amazonaws:HadoopForGlueSparkHistoryServer:jar:2.0-SNAPSHOT
[INFO] +- org.apache.hadoop:hadoop-aws:jar:3.2.1:provided
[INFO] |  \- com.amazonaws:aws-java-sdk-bundle:jar:1.11.375:provided
[INFO] +- com.amazonaws:aws-java-sdk-core:jar:1.11.901:compile
[INFO] \- com.amazonaws:aws-java-sdk-s3:jar:1.11.901:compile
[INFO]    +- com.amazonaws:aws-java-sdk-kms:jar:1.11.901:compile
[INFO]    \- com.amazonaws:jmespath-java:jar:1.11.901:compile
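
The mismatch shows up in the tree above: hadoop-aws pulls in aws-java-sdk-bundle 1.11.375 with provided scope, while aws-java-sdk-core, aws-java-sdk-s3 and aws-java-sdk-kms resolve to 1.11.901. As a rough sketch of how this check could be automated, the Python snippet below groups the com.amazonaws SDK artifacts by version. It assumes the output format shown above. The function name `find_sdk_versions` and the `TREE` sample are hypothetical helpers for illustration, not part of this repository.

```python
import re
from collections import defaultdict

# Matches dependency:tree coordinates of the form
# group:artifact:jar:version:scope, as printed in the output above.
COORD = re.compile(
    r"(?P<group>[\w.\-]+):(?P<artifact>[\w.\-]+):jar:"
    r"(?P<version>[\w.\-]+):(?P<scope>\w+)"
)

# Sample input copied from the `mvn dependency:tree` output above.
TREE = r"""
[INFO] com.amazonaws:HadoopForGlueSparkHistoryServer:jar:2.0-SNAPSHOT
[INFO] +- org.apache.hadoop:hadoop-aws:jar:3.2.1:provided
[INFO] |  \- com.amazonaws:aws-java-sdk-bundle:jar:1.11.375:provided
[INFO] +- com.amazonaws:aws-java-sdk-core:jar:1.11.901:compile
[INFO] \- com.amazonaws:aws-java-sdk-s3:jar:1.11.901:compile
[INFO]    +- com.amazonaws:aws-java-sdk-kms:jar:1.11.901:compile
[INFO]    \- com.amazonaws:jmespath-java:jar:1.11.901:compile
"""


def find_sdk_versions(tree_output, prefix="aws-java-sdk"):
    """Group com.amazonaws artifacts whose name starts with `prefix` by version.

    Returns a dict mapping version -> list of (artifact, scope) tuples.
    More than one key means the SDK versions on the classpath are not aligned.
    """
    versions = defaultdict(list)
    for line in tree_output.splitlines():
        match = COORD.search(line)
        if match is None:
            continue  # not a dependency line with a scope (e.g. the project itself)
        if match.group("group") != "com.amazonaws":
            continue
        if not match.group("artifact").startswith(prefix):
            continue
        versions[match.group("version")].append(
            (match.group("artifact"), match.group("scope"))
        )
    return dict(versions)


if __name__ == "__main__":
    found = find_sdk_versions(TREE)
    for version, artifacts in sorted(found.items()):
        print(version, artifacts)
    if len(found) > 1:
        print("AWS SDK versions are not aligned")
```

The root project line is skipped because it carries no scope, and jmespath-java is ignored because its name does not start with the SDK prefix.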

After aligning the versions, the container starts successfully.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

SercanKaraoglu commented 2 years ago

@jpeddicord @hyandell @abhi1993

moomindani commented 2 years ago

Sorry for the late reply. I have merged another, newer pull request (https://github.com/aws-samples/aws-glue-samples/pull/116) that fixes this issue. Could you please refresh your local copy and try again with the current version?