CODAIT / stocator

Stocator is a high-performing connector to object storage for Apache Spark, achieving performance by leveraging object storage semantics.
Apache License 2.0

Spark append-mode throws FileAlreadyExistsException #190

Closed RoeeShlomo closed 6 years ago

RoeeShlomo commented 6 years ago

Commit https://github.com/CODAIT/stocator/commit/35d7818c5dd0e3f2ea13d4bc3338aaa14c3bff31 introduced a bug in append mode. When writing in append mode, the operation always fails with FileAlreadyExistsException:

Name: org.apache.hadoop.fs.FileAlreadyExistsException
Message: mkdir on existing directory cos://...

Steps to reproduce:

spark.range(10).write.mode("append").parquet("cos://yourbucket/yourobject")
spark.range(10).write.mode("append").parquet("cos://yourbucket/yourobject")

gilv commented 6 years ago

This is already resolved in the master branch. The fix will be released with 1.0.20.
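
For readers following the reproduction steps, the error message ("mkdir on existing directory") suggests that the second append fails because a directory-creation step treats an already existing target as an error. The sketch below is only a local-filesystem analogy in Python, not Stocator or Hadoop code: `append_write`, `run_two_appends`, and the `strict` flag are hypothetical names introduced for illustration, and a temporary directory stands in for the `cos://` path.

```python
import os
import tempfile


def append_write(target_dir, part_name, strict):
    """Toy stand-in for an append-mode write: ensure the directory, then add a part file."""
    # strict=True mimics a mkdir that rejects an existing directory;
    # strict=False mimics `mkdir -p`, which tolerates an existing directory.
    os.makedirs(target_dir, exist_ok=not strict)
    with open(os.path.join(target_dir, part_name), "w") as f:
        f.write("data\n")


def run_two_appends(strict):
    """Run two appends into a fresh temporary location; return 'ok' or the error name."""
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "yourobject")  # hypothetical local stand-in path
        try:
            append_write(target, "part-0", strict)
            append_write(target, "part-1", strict)  # directory exists by now
        except FileExistsError as exc:
            return type(exc).__name__
        return "ok"


if __name__ == "__main__":
    print(run_two_appends(strict=True))
    print(run_two_appends(strict=False))
```

With the strict variant the second call fails because the directory is already there, which mirrors the symptom reported in this issue. The tolerant variant matches what append mode needs: an existing output location is expected rather than exceptional.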