databricks / spark-avro

Avro Data Source for Apache Spark
http://databricks.com/
Apache License 2.0

spark-avro ignores the compression option in DataFrameWriter #259

Open juarezr opened 6 years ago

juarezr commented 6 years ago

The documentation says the compression codec should be set in the Spark configuration:

spark.conf.set("spark.sql.avro.compression.codec", "deflate")

But other formats, such as Parquet, allow setting it through DataFrameWriter options:

rowDataset.write()
    .format("com.databricks.spark.avro")
    .option("compression", "snappy")
    .save(path);

For consistency, spark-avro could also respect this setting.
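
To make the request concrete, here is a minimal sketch in plain Java of the precedence I have in mind: a per-write "compression" option wins, then the session setting, then a default. This is not spark-avro code. The class name CodecResolution, the method resolveCodec, and the fallback value "snappy" are assumptions made up for illustration, and the two maps only stand in for the writer options and the session configuration.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in spark-avro.
final class CodecResolution {
    // Configuration key quoted in the documentation.
    static final String CONF_KEY = "spark.sql.avro.compression.codec";
    // Option name used by other formats such as Parquet.
    static final String OPTION_KEY = "compression";
    // Assumed fallback when neither source sets a codec.
    static final String ASSUMED_DEFAULT = "snappy";

    // Returns the codec to use: writer option first, then session conf, then the default.
    static String resolveCodec(Map<String, String> writerOptions,
                               Map<String, String> sessionConf) {
        String fromOption = writerOptions.get(OPTION_KEY);
        if (fromOption != null) {
            return fromOption.toLowerCase(Locale.ROOT);
        }
        String fromConf = sessionConf.get(CONF_KEY);
        if (fromConf != null) {
            return fromConf.toLowerCase(Locale.ROOT);
        }
        return ASSUMED_DEFAULT;
    }

    public static void main(String[] args) {
        // The writer option overrides the session-level setting.
        String codec = resolveCodec(
                Map.of(OPTION_KEY, "snappy"),
                Map.of(CONF_KEY, "deflate"));
        System.out.println(codec); // prints "snappy"
    }
}
```

With this ordering, existing jobs that only set spark.sql.avro.compression.codec would keep their current behavior, since the option is consulted only when it is present.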