spark-redshift-community / spark-redshift

Performant Redshift data source for Apache Spark
Apache License 2.0
137 stars 63 forks

Support parquet as tempformat for AWS EMRFS Optimised Committer #60

Open fd8s0 opened 4 years ago

fd8s0 commented 4 years ago

Using this on EMR can occasionally lead to errors on the COPY command, and it also performs worse than necessary, because EMR has only enabled its optimised committer for Parquet writes.

https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark-committer-reqs.html

Supporting parquet as tempformat should fix this.
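
For illustration, a minimal sketch of what a write might look like if this were supported. It assumes the connector would accept "PARQUET" as a `tempformat` value, which is the feature requested here and not confirmed behaviour. The helper name `redshift_write_options`, the host, bucket, table and IAM role are all hypothetical placeholders.

```python
# Sketch only: "PARQUET" is the tempformat value this issue asks for,
# not an option confirmed to exist. All names and URLs are placeholders.

def redshift_write_options(url, dbtable, tempdir, iam_role, tempformat="PARQUET"):
    """Build the option map for a spark-redshift DataFrame write."""
    return {
        "url": url,                # JDBC URL of the Redshift cluster
        "dbtable": dbtable,        # target table for the COPY
        "tempdir": tempdir,        # S3 staging location (written via EMRFS on EMR)
        "aws_iam_role": iam_role,  # role Redshift assumes to read the staged files
        "tempformat": tempformat,  # format of the staged files
    }

opts = redshift_write_options(
    url="jdbc:redshift://example-host:5439/dev?user=USER&password=PASS",
    dbtable="my_table",
    tempdir="s3://example-bucket/tmp/",
    iam_role="arn:aws:iam::123456789012:role/example-redshift-copy",
)

# With a live SparkSession and a DataFrame `df`, the write would then be:
# df.write.format("io.github.spark_redshift_community.spark.redshift") \
#     .options(**opts).mode("append").save()
```

The idea is that the staged files would be written as Parquet, so the EMRFS optimised committer described in the linked page could apply to the staging step.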