apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0

Support AVRO Format for Write Queries #7679

Open devinjdangelo opened 1 year ago

devinjdangelo commented 1 year ago

Is your feature request related to a problem or challenge?

Writing Avro files is currently not supported in DataFusion, for example via COPY or INSERT INTO queries. This would be a nice feature to have, and it recently came up in the Discord channel.

Describe the solution you'd like

I expect some of the common functions in https://github.com/apache/arrow-datafusion/blob/main/datafusion/core/src/datasource/file_format/write.rs can be reused in the Avro implementation.

Describe alternatives you've considered

Don't support writing Avro files.

Additional context

No response

Veeupup commented 10 months ago

@alamb Hi, I can help with this ticket : )

alamb commented 10 months ago

> @alamb Hi, I can help with this ticket : )

Thanks @Veeupup -- for this one the first thing we need to do is get an Avro writer, and the second part is hooking it into DataFusion. There is some work upstream in arrow-rs to make an Avro reader/writer in https://github.com/apache/arrow-rs/issues/4886, but I think that project is stalled at the moment.