awslabs / aws-glue-libs

AWS Glue Libraries are additions and enhancements to Spark for ETL operations.

Unable to Create GlueContext via GlueContext Function in Local Python/awsglue Environment #58

Closed adamfortuno closed 4 years ago

adamfortuno commented 4 years ago

I'm having the issue described in issue #42.

I am attempting to run the following in my local PySpark console...

from awsglue.context import GlueContext
glueContext = GlueContext(sc)

I receive the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\XYZ\bin\aws-glue-libs\PyGlue.zip\awsglue\context.py", line 47, in __init__
  File "C:\Users\XYZ\bin\aws-glue-libs\PyGlue.zip\awsglue\context.py", line 68, in _get_glue_scala_context
TypeError: 'JavaPackage' object is not callable

The following is the complete picture: image

The environment looks like the following:

My environment variables look like the following...

Just to confirm which version of the awsglue repo I'm working with...

image

The following are the "netty" files in my ..\aws-glue-libs\jarsv1\:

image

I'm looking for a little guidance on how to tweak my configuration to resolve this issue.

calleo commented 4 years ago

Typically this is due to a misconfigured classpath. Check that the AWS Glue jars are on the Spark classpath.
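One way to confirm this from the PySpark console is sketched below. py4j hands back a JavaPackage placeholder for any dotted name it cannot resolve to a class, and calling that placeholder is what raises "'JavaPackage' object is not callable". The helper name glue_context_class_visible is hypothetical, and the fully qualified name com.amazonaws.services.glue.GlueContext is an assumption taken from the Glue Scala API rather than from this thread.

```python
def glue_context_class_visible(jvm):
    """Return True if the JVM view resolves GlueContext to a real class.

    `jvm` is expected to be the py4j JVM view, e.g. `sc._jvm` in a PySpark
    console. The class name below is an assumption (Glue Scala API).
    """
    candidate = jvm.com.amazonaws.services.glue.GlueContext
    # py4j returns a JavaPackage object when no class with that name is
    # found on the classpath, and a JavaClass object when one is found.
    return type(candidate).__name__ != "JavaPackage"
```

In the console this would be called as glue_context_class_visible(sc._jvm). A result of False suggests the Glue jars are not visible to the driver JVM.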

adamfortuno commented 4 years ago

The driver and executor settings were missing from my spark-defaults.conf file. I made the following additions:

spark.driver.extraClassPath         C://Users/XYZ/bin/aws-glue-libs/jarsv1/*
spark.executor.extraClassPath       C://Users/XYZ/bin/aws-glue-libs/jarsv1/*

Everything worked as expected.
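For anyone checking their own setup, here is a minimal sketch that scans the text of a spark-defaults.conf file and reports which of the two classpath settings above are missing. The function names are hypothetical, and the parser is a simplification: it only handles the whitespace-separated "key value" form used above and skips blank lines and lines starting with "#".

```python
REQUIRED_KEYS = (
    "spark.driver.extraClassPath",
    "spark.executor.extraClassPath",
)


def read_spark_defaults(text):
    """Parse whitespace-separated 'key value' lines into a dict."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            settings[parts[0]] = parts[1].strip()
    return settings


def missing_classpath_keys(text):
    """Return the required classpath keys that are absent from the config text."""
    settings = read_spark_defaults(text)
    return [key for key in REQUIRED_KEYS if key not in settings]
```

Passing the contents of the file (for example via open(path).read(), where path is wherever your Spark installation keeps its conf directory) returns an empty list when both settings are present.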