spark-redshift-community / spark-redshift

Performant Redshift data source for Apache Spark
Apache License 2.0
136 stars 62 forks

Unload failing if set to use java.time.Instant in Spark 3.x #94

Open honeybadgerdoesntcare opened 3 years ago

honeybadgerdoesntcare commented 3 years ago

With Spark 3.x, it's recommended to use java.time.Instant over java.sql.Timestamp (reference: https://databricks.com/blog/2020/07/22/a-comprehensive-look-at-dates-and-timestamps-in-apache-spark-3-0.html). This is enabled with: spark.conf.set("spark.sql.datetime.java8API.enabled", "true")

However, with that turned on, the upload step fails, most likely because the spark-redshift library still uses java.sql.Timestamp-related formats: https://github.com/spark-redshift-community/spark-redshift/blob/master/src/main/scala/io/github/spark_redshift_community/spark/redshift/RedshiftWriter.scala#L240
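To illustrate the kind of change that might address this, here is a minimal sketch of a converter that accepts both timestamp representations. It is not the code from the linked file or from any pull request. The object name `TimestampCompat`, the function `formatTimestamp`, the output pattern, and the choice of UTC are all assumptions made for illustration.

```scala
import java.sql.Timestamp
import java.time.{Instant, ZoneOffset}
import java.time.format.DateTimeFormatter

// Hypothetical helper, not part of spark-redshift.
object TimestampCompat {
  // Assumed output pattern and time zone, chosen only for this example.
  private val formatter: DateTimeFormatter =
    DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS").withZone(ZoneOffset.UTC)

  // Formats a timestamp value regardless of which external type Spark hands over:
  // java.time.Instant (java8API enabled) or java.sql.Timestamp (the default).
  def formatTimestamp(value: Any): String = value match {
    case null         => null
    case i: Instant   => formatter.format(i)
    case t: Timestamp => formatter.format(t.toInstant)
    case other =>
      throw new IllegalArgumentException(
        s"Unsupported timestamp value: ${other.getClass.getName}")
  }
}
```

Matching on the runtime type instead of casting straight to java.sql.Timestamp would let the same write path work with the setting on or off. A real fix would need to keep whatever format and time zone handling the library already relies on.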

Eager to see this improved and aligned with Spark 3.x releases.

jsleight commented 3 years ago

Makes sense to me. Thanks for providing links 😄

Do you have interest in submitting a PR? I'm happy to review and help get this merged/released.

honeybadgerdoesntcare commented 3 years ago

@jsleight My pleasure. Here is a quick one: https://github.com/spark-redshift-community/spark-redshift/pull/95