apache / parquet-format

Apache Parquet Format
https://parquet.apache.org/
Apache License 2.0

GH-455: Add Variant specification docs #456

Closed: gene-db closed this 1 month ago

gene-db commented 1 month ago

Rationale for this change

The Spark and Parquet communities have agreed to move the Spark Variant spec to Parquet.

What changes are included in this PR?

Added the Variant specification docs.

Do these changes have PoC implementations?

Closes #455

sfc-gh-aixu commented 1 month ago

+1. Thanks @gene-db for working on it. So will we include the preliminary shredding spec as well? I'm fine with that.

gene-db commented 1 month ago

@rdblue I updated the PR to add licenses to the docs. I think that should make the tests pass.

gene-db commented 1 month ago

@julienledem Thanks! I clarified some of the comments, and I will address them in a follow-up PR.

alamb commented 3 days ago

Does anyone know of Parquet implementations that implement the Variant type?

I would like to try to organize getting this into the Rust implementation (see https://github.com/apache/arrow-rs/issues/6736), but I couldn't find any example data or implementations while writing that up.