Open kherrera-ebsco opened 2 years ago
Did you find any solution for this?
I ran into the same issue running a Jupyter notebook locally (using AWS Glue version 4.0: amazon/aws-glue-libs:glue_libs_4.0.0_image_01).
The cell in question is:
dyf = glueContext.create_dynamic_frame.from_catalog(
    database='new_db',
    table_name='my_table')
Some background:
- The Glue connection has enforceSSL set to true.
- The connection's JDBC URL does not include sslmode (e.g. jdbc:postgresql://blablabla.us-east-1.rds.amazonaws.com:5432/database).

The only way I've managed to get it to work is by passing additional_options to override the connection's value and disable SSL enforcement, like this:
dyf = glueContext.create_dynamic_frame.from_catalog(
    database='new_db',
    table_name='my_table',
    additional_options={"enforceSSL": "false"})
Lastly, I tried creating a Spark DataFrame over JDBC directly, as shown below. As you can see, I'm setting sslmode to verify-ca to do CA verification.
This works ONLY if I put the RDS global-bundle cert at the container path /home/glue_user/.postgresql/root.crt, which is the JDBC driver's default location.
jdbcDF = spark.read \
.format("jdbc") \
.option("url", "jdbc:postgresql://blablabla.us-east-1.rds.amazonaws.com:5432/database?sslmode=verify-ca") \
.option("dbtable", "myschema.mytable") \
.option("user", "test") \
.option("password", "test") \
.load()
jdbcDF.printSchema()
jdbcDF.count()
jdbcDF.show(10)
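One thing that may be worth trying for the direct Spark read: the PostgreSQL JDBC driver accepts an sslrootcert URL parameter, so the cert should not have to sit in the default location. Below is a minimal sketch that only builds the URL. The helper name with_ssl_params and the cert path /home/glue_user/certs/global-bundle.pem are hypothetical, and I have not confirmed that Glue's from_catalog honors these URL parameters.

```python
from urllib.parse import parse_qsl, urlencode


def with_ssl_params(jdbc_url, sslmode="verify-ca", sslrootcert=None):
    """Return jdbc_url with sslmode (and optionally sslrootcert) set in its query string."""
    base, _, query = jdbc_url.partition("?")
    params = dict(parse_qsl(query))  # keep any parameters already on the URL
    params["sslmode"] = sslmode
    if sslrootcert is not None:
        params["sslrootcert"] = sslrootcert
    # safe="/" keeps the certificate path readable instead of percent-encoding it
    return base + "?" + urlencode(params, safe="/")


url = with_ssl_params(
    "jdbc:postgresql://blablabla.us-east-1.rds.amazonaws.com:5432/database",
    sslrootcert="/home/glue_user/certs/global-bundle.pem",  # hypothetical path
)
print(url)
```

The resulting string would then go into .option("url", url) in the spark.read call above.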
I guess what would solve the issue for Glue's from_catalog is being able to set the location of the CA certificate. Sadly, I haven't been able to find the right configuration parameters/settings for that.
Does anybody have any pointers?
Are certificates missing or outdated for the image? I am receiving the following error when using a Glue JDBC connection that has enforceSSL enabled.