databricks / spark-redshift

Redshift data source for Apache Spark
Apache License 2.0

CSV file format while writing data to Redshift. #419

Open vishooo opened 5 years ago

vishooo commented 5 years ago

Hi, I am trying to use the code below:

`dataFrame.write.format("com.databricks.spark.redshift").option("url", url).option("tempdir", tempDir).option("tempFormat", "CSV").option("dbtable", "test_csv1").option("aws_iam_role", iam).mode('overwrite').save()`

But it is still storing the data in Avro format in the S3 temp location. I am using the spark-redshift 2.11 connector.
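For readability, here is a sketch of the same write, reformatted. The connector README documents the staging-format option in lowercase as `tempformat` (values `AVRO`, `CSV`, `CSV GZIP`); depending on the Spark and connector versions, option keys may not be treated case-insensitively, so the lowercase spelling is the safer choice. The `dataFrame`, `url`, `tempDir`, and `iam` values are placeholders carried over from the original post.

```python
# Minimal sketch of the write from the original post, reformatted for readability.
# "tempformat" is the option name as documented in the connector README;
# dataFrame, url, tempDir, and iam are assumed to be defined by the caller.
(dataFrame.write
    .format("com.databricks.spark.redshift")
    .option("url", url)              # Redshift JDBC URL
    .option("tempdir", tempDir)      # S3 staging directory
    .option("tempformat", "CSV")     # lowercase key, per the README
    .option("dbtable", "test_csv1")
    .option("aws_iam_role", iam)
    .mode("overwrite")
    .save())
```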

_Originally posted by @vishooo in https://github.com/databricks/spark-redshift/issue_comments#issuecomment-451117579_

agnarok commented 4 years ago

I'm having the same problem, but with `dataframe.read`.

ballcap231 commented 3 years ago

I believe you need Spark 3.x in order to save the `tempdir` output in CSV format. Spark really should at least give an error when you specify an invalid `tempdir` file format. See this: https://github.com/databricks/spark-redshift/issues/308
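Since an unsupported `tempformat` value appears to be silently ignored rather than rejected, one way to confirm which format actually landed in the staging directory is to list the objects under the `tempdir` prefix after a write. The bucket and prefix below are hypothetical, and this sketch assumes boto3 with AWS credentials already configured; CSV staging files typically end in `.csv`/`.csv.gz`, Avro files in `.avro`.

```python
import boto3

# Hypothetical bucket/prefix standing in for the tempdir used in the write.
s3 = boto3.client("s3")
resp = s3.list_objects_v2(Bucket="my-temp-bucket", Prefix="spark-redshift-temp/")
for obj in resp.get("Contents", []):
    print(obj["Key"])  # inspect the extensions to see whether CSV or Avro was written
```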