awslabs / aws-glue-libs

AWS Glue Libraries are additions and enhancements to Spark for ETL operations.

Unable to Build AWSGlueETLPython 1.0.0 with maven when I run gluepyspark command on terminal. #52

Closed cploutarchou closed 4 years ago

cploutarchou commented 4 years ago

[WARNING] The POM for com.amazonaws:AWSSDKGlueJavaClient:jar:1.0 is missing, no dependency information available
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.509 s
[INFO] Finished at: 2020-05-06T02:08:18+03:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project AWSGlueETLPython: Could not resolve dependencies for project com.amazonaws:AWSGlueETLPython:jar:1.0.0: Failure to find com.amazonaws:AWSSDKGlueJavaClient:jar:1.0 in https://aws-glue-etl-artifacts.s3.amazonaws.com/release/ was cached in the local repository, resolution will not be reattempted until the update interval of aws-glue-etl-artifacts has elapsed or updates are forced -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException

svajiraya commented 4 years ago

@cploutarchou You can try removing the ~/.m2 and aws-glue-libs/jars* directories to reset the build and re-run gluepyspark to start the build from scratch.

rm -rf ~/.m2
rm -rf aws-glue-libs/jars*

Deleting the ~/.m2 directory is generally safe, provided it holds no custom settings.xml: Maven will just download the jars again (depending on the project, this may take some time).
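
A narrower alternative to wiping the whole cache, sketched here under assumptions. The phrase "was cached in the local repository" in the error refers to the failure markers (files ending in `.lastUpdated`) that Maven leaves in the local repository after a failed lookup. The helper name `clear_failed_lookups` is hypothetical and not part of aws-glue-libs or Maven. Removing the markers only helps if the artifact really exists in the remote repository.

```python
from pathlib import Path
from typing import List


def clear_failed_lookups(repo_root: Path, dry_run: bool = True) -> List[Path]:
    """Return Maven's *.lastUpdated failure markers under repo_root.

    With dry_run=False the markers are also deleted, so the next build
    retries the download instead of reusing the cached failure.
    """
    if not repo_root.is_dir():
        # Nothing cached yet (or a wrong path): report no markers.
        return []
    markers = sorted(repo_root.rglob("*.lastUpdated"))
    if not dry_run:
        for marker in markers:
            marker.unlink()
    return markers
```

Calling `clear_failed_lookups(Path.home() / ".m2" / "repository")` lists the markers without touching them. Passing `dry_run=False` deletes them. Running Maven with the `-U` switch is the built-in way to force the same re-check.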

cploutarchou commented 4 years ago

Nothing changed. The issue is still there after I cleaned all the directories. AWSSDKGlueJavaClient cannot be found for some reason.

cploutarchou commented 4 years ago

Note: The issue is related to Glue version 1.0

GytisZ commented 4 years ago

Getting the same issue:

Failed to execute goal on project AWSGlueETLPython: Could not resolve dependencies for project com.amazonaws:AWSGlueETLPython:jar:1.0.0: Could not find artifact com.amazonaws:AWSSDKGlueJavaClient:jar:1.0 in aws-glue-etl-artifacts (https://aws-glue-etl-artifacts.s3.amazonaws.com/release/)

Upon manual inspection, it doesn't look like any AWSSDKGlueJavaClient packages are in the repo. The build worked yesterday, so I guess it was recently removed?
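
The manual inspection can be approximated with a short script. This is a sketch, assuming the repository uses the standard Maven layout (groupId dots become slashes, then artifactId, version, and `artifactId-version.ext`). The function names `artifact_url` and `artifact_exists` are hypothetical. The base URL is the one from the error message.

```python
import urllib.error
import urllib.request

RELEASE_REPO = "https://aws-glue-etl-artifacts.s3.amazonaws.com/release/"


def artifact_url(group_id, artifact_id, version, ext="pom", repo=RELEASE_REPO):
    """Build the URL of an artifact file in a standard-layout Maven repo."""
    group_path = group_id.replace(".", "/")
    file_name = f"{artifact_id}-{version}.{ext}"
    return f"{repo.rstrip('/')}/{group_path}/{artifact_id}/{version}/{file_name}"


def artifact_exists(url, timeout=10.0):
    """Send a HEAD request; treat any HTTP error status as 'not there'.

    Network failures (urllib.error.URLError) are deliberately not caught,
    so a broken connection is not mistaken for a missing artifact.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        return False
```

For example, `artifact_exists(artifact_url("com.amazonaws", "AWSSDKGlueJavaClient", "1.0", ext="jar"))` checks for the jar that the build reports as missing.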

svajiraya commented 4 years ago

You're right!

From the look of it, the pom file (https://aws-glue-etl-artifacts.s3.amazonaws.com/release/com/amazonaws/AWSGlueETL/1.0.0/AWSGlueETL-1.0.0.pom) was updated recently. Its <properties> section differs from the one in my locally cached copy (~/.m2/repository/com/amazonaws/AWSGlueETL/1.0.0/AWSGlueETL-1.0.0.pom).

I can see 3 new properties:

    <glue.sdk.artifactid>AWSSDKGlueJavaClient</glue.sdk.artifactid>
    <glue.sdk.version>1.0</glue.sdk.version>
    <aws.sdk.version>1.11.774</aws.sdk.version>

and two of them are used in:

    <dependency>
      <groupId>com.amazonaws</groupId>
      <artifactId>${glue.sdk.artifactid}</artifactId>
      <version>${glue.sdk.version}</version>
    </dependency>

and I believe this is causing the issue. I have started a new docker build process on my end to confirm this.
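
To make the link between the properties and the failing coordinate explicit, here is a small sketch of how the `${...}` placeholders expand. It uses a trimmed, namespace-free fragment built from the snippets above, not the real pom file. The function name `resolve_dependencies` is hypothetical, and the substitution is a simplification of what Maven actually does.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed fragment for illustration; the real pom declares an XML namespace
# and many more elements.
POM_FRAGMENT = """
<project>
  <properties>
    <glue.sdk.artifactid>AWSSDKGlueJavaClient</glue.sdk.artifactid>
    <glue.sdk.version>1.0</glue.sdk.version>
  </properties>
  <dependencies>
    <dependency>
      <groupId>com.amazonaws</groupId>
      <artifactId>${glue.sdk.artifactid}</artifactId>
      <version>${glue.sdk.version}</version>
    </dependency>
  </dependencies>
</project>
"""


def resolve_dependencies(pom_xml):
    """Expand ${property} placeholders and return group:artifact:version strings."""
    root = ET.fromstring(pom_xml)
    props_node = root.find("properties")
    props = {p.tag: (p.text or "") for p in (props_node if props_node is not None else [])}

    def expand(text):
        # Unknown placeholders are left untouched rather than guessed.
        return re.sub(r"\$\{([^}]+)\}", lambda m: props.get(m.group(1), m.group(0)), text)

    coordinates = []
    for dep in root.iter("dependency"):
        parts = [expand(dep.findtext(tag, "")) for tag in ("groupId", "artifactId", "version")]
        coordinates.append(":".join(parts))
    return coordinates
```

Running `resolve_dependencies(POM_FRAGMENT)` yields `com.amazonaws:AWSSDKGlueJavaClient:1.0`, which matches the artifact named in the build error.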

svajiraya commented 4 years ago

Update: I'm able to reproduce this error as well.

[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  24:43 min
[INFO] Finished at: 2020-05-06T09:13:23Z
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project AWSGlueETLPython: Could not resolve dependencies for project com.amazonaws:AWSGlueETLPython:jar:1.0.0: Could not find artifact com.amazonaws:AWSSDKGlueJavaClient:jar:1.0 in aws-glue-etl-artifacts (https://aws-glue-etl-artifacts.s3.amazonaws.com/release/) -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
The command '/bin/sh -c mvn -f /glue/pom.xml -DoutputDirectory=/glue/jarsv1 dependency:copy-dependencies' returned a non-zero code: 1

svajiraya commented 4 years ago

This issue should be resolved now. The AWS Glue team has pushed a change to https://aws-glue-etl-artifacts.s3.amazonaws.com/release/com/amazonaws/AWSGlueETL/1.0.0/AWSGlueETL-1.0.0.pom that fixes it. The pom file was referring to a non-existent dependency, which caused the build errors.

cploutarchou commented 4 years ago

Nice. Everything works as expected now and AWS Glue runs in my local environment. Thanks @svajiraya

swap-lm10 commented 4 years ago

This has been fixed by the Glue team with an update to the Maven libraries. If the issue persists, you might need to force an update of the Maven libraries. One way to do that is to delete the locally cached copy of the AWSGlueETL library. For instance, on my MacBook the local Maven repository is in $HOME/.m2. To delete the locally cached version of the AWSGlueETL library, remove the following path:

rm -r $HOME/.m2/repository/com/amazonaws/AWSGlueETL
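
The same targeted cleanup can be scripted. This is a sketch that assumes the default local repository location ($HOME/.m2/repository). The helper names `cached_artifact_dir` and `drop_cached_artifact` are hypothetical.

```python
import shutil
from pathlib import Path


def cached_artifact_dir(group_id, artifact_id, repo_root=None):
    """Return where the local Maven repository stores every version of an artifact."""
    root = Path(repo_root) if repo_root is not None else Path.home() / ".m2" / "repository"
    return root.joinpath(*group_id.split("."), artifact_id)


def drop_cached_artifact(group_id, artifact_id, repo_root=None):
    """Delete the cached copies of one artifact; return True if anything was removed."""
    target = cached_artifact_dir(group_id, artifact_id, repo_root)
    if target.is_dir():
        shutil.rmtree(target)
        return True
    return False
```

`drop_cached_artifact("com.amazonaws", "AWSGlueETL")` removes the same directory as the `rm -r` command above and leaves the rest of the cache in place, so the next build only re-downloads that one library.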

sshah90 commented 2 years ago

I am getting a similar error while running ./bin/gluepyspark. I tried cleaning up the .m2 directory, but it didn't help.

Error message:

[ERROR] Failed to execute goal on project AWSGlueETLPython: Could not resolve dependencies for project com.amazonaws:AWSGlueETLPython:jar:3.0.0: Could not find artifact jdk.tools:jdk.tools:jar:1.8 at specified path /Library/Internet Plug-Ins/JavaAppletPlugin.plugin/Contents/Home/../lib/tools.jar -> [Help 1]

Any suggestions?

Aesthet commented 1 year ago

I faced the same issue. Were you able to overcome it?
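
One possible reading of the jdk.tools error, offered as an assumption rather than a confirmed fix: the path in the message starts at the browser Java plug-in (JavaAppletPlugin.plugin) and then looks for ../lib/tools.jar. That suggests Maven is running on that plug-in runtime instead of a JDK 8 installation, which is where tools.jar normally lives. The sketch below checks whether a given Java home would satisfy that lookup. The function names are hypothetical, and the "../lib/tools.jar" rule is inferred from the error text.

```python
import os
from pathlib import Path


def tools_jar_path(java_home):
    """Resolve <java_home>/../lib/tools.jar, the location shown in the error."""
    return Path(os.path.normpath(os.path.join(str(java_home), "..", "lib", "tools.jar")))


def has_tools_jar(java_home):
    """True if the tools.jar lookup relative to java_home would succeed."""
    return tools_jar_path(java_home).is_file()
```

If `has_tools_jar` returns False for the Java home that Maven reports (for example via `mvn -version`), pointing JAVA_HOME at a JDK 8 installation before re-running ./bin/gluepyspark may be worth trying.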