awslabs / aws-glue-libs

AWS Glue Libraries are additions and enhancements to Spark for ETL operations.

DynamicFrameWriter fails to parse jdbc URL query params #183

Open jon-laf-nexient-numo opened 1 year ago

jon-laf-nexient-numo commented 1 year ago

I encountered an error while using the Glue library's DynamicFrameWriter class.

Stacktrace fragment:

Exception info:

Traceback (most recent call last):
  File "/tmp/populate_temp_database.py", line 92, in read_and_write_and_log
    write_table(gc, dyF, sink_table, args)
  File "/tmp/localPyFiles-f85ec700-7bac-4fc3-849f-e355c2a14ac8/glue_utils.py", line 90, in write_table_from_jdbc_conf
    glue_context.write_dynamic_frame.from_jdbc_conf(
  File "/opt/amazon/lib/python3.7/site-packages/awsglue/dynamicframe.py", line 666, in from_jdbc_conf
    return self._glue_context.write_dynamic_frame_from_jdbc_conf(frame,
  File "/opt/amazon/lib/python3.7/site-packages/awsglue/context.py", line 419, in write_dynamic_frame_from_jdbc_conf
    self.write_from_jdbc_conf(frame, catalog_connection, connection_options, redshift_tmp_dir, transformation_ctx,
  File "/opt/amazon/lib/python3.7/site-packages/awsglue/context.py", line 436, in write_from_jdbc_conf
    return DataSink(j_sink, self).write(frame_or_dfc)
  File "/opt/amazon/lib/python3.7/site-packages/awsglue/data_sink.py", line 39, in write
    return self.writeFrame(dynamic_frame_or_dfc, info)
  File "/opt/amazon/lib/python3.7/site-packages/awsglue/data_sink.py", line 32, in writeFrame
    return DynamicFrame(self._jsink.pyWriteDynamicFrame(dynamic_frame._jdf, callsite(), info), dynamic_frame.glue_ctx, dynamic_frame.name + "_errors")
  File "/opt/amazon/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1321, in __call__
    return_value = get_return_value(
  File "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 190, in deco
    return f(*a, **kw)
  File "/opt/amazon/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/protocol.py", line 326, in get_return_value
    raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling o122.pyWriteDynamicFrame.
: java.sql.SQLException: The connection property 'sslMode' acceptable values are: 'DISABLED', 'PREFERRED', 'REQUIRED', 'VERIFY_CA' or 'VERIFY_IDENTITY'. The value 'REQUIRED?useCursorFetch=true' is not acceptable.
	at

Note the exception at the bottom of this stack trace excerpt:

The connection property 'sslMode' acceptable values are: 'DISABLED', 'PREFERRED', 'REQUIRED', 'VERIFY_CA' or 'VERIFY_IDENTITY'. The value 'REQUIRED?useCursorFetch=true' is not acceptable

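It appears that Glue appends the query parameter ?useCursorFetch=true to the JDBC URL under the hood without accounting for a query string that is already present. The sink JDBC URL I passed in the connection_options argument of GlueContext.write_dynamic_frame.from_jdbc_conf already ended with ?sslMode=REQUIRED, so the appended text was read as part of the sslMode value ('REQUIRED?useCursorFetch=true') instead of as a separate parameter.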
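To illustrate the suspected behavior, here is a minimal plain-Python sketch. It is not Glue's actual code, which I have not inspected. The helper names (append_param_naive, append_param, query_params), the host and the database name are hypothetical. It only shows how blindly appending "?useCursorFetch=true" to a URL that already has a query string yields the value seen in the exception, and how choosing the separator based on the existing URL avoids it.

```python
def append_param_naive(url: str, param: str) -> str:
    """Suspected behavior: always join with '?', even if a query string exists."""
    return f"{url}?{param}"


def append_param(url: str, param: str) -> str:
    """Join with '&' when the URL already has a query string, otherwise '?'."""
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}{param}"


def query_params(url: str) -> dict:
    """Split everything after the first '?' into key/value pairs on '&'."""
    _, _, query = url.partition("?")
    return dict(pair.split("=", 1) for pair in query.split("&") if pair)


# Hypothetical sink URL; only the ?sslMode=REQUIRED suffix comes from the report.
sink_url = "jdbc:mysql://example-host:3306/exampledb?sslMode=REQUIRED"

broken = append_param_naive(sink_url, "useCursorFetch=true")
fixed = append_param(sink_url, "useCursorFetch=true")

print(query_params(broken))  # sslMode swallows the appended parameter
print(query_params(fixed))   # sslMode and useCursorFetch are separate keys
```

With the naive version, the only key is sslMode and its value is 'REQUIRED?useCursorFetch=true', which matches the value rejected in the SQLException above.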