NVIDIA / spark-rapids

Spark RAPIDS plugin - accelerate Apache Spark with GPUs
https://nvidia.github.io/spark-rapids
Apache License 2.0

[FEA] Test V2 Parquet encoded files with reader #10835

Open abellina opened 1 month ago

abellina commented 1 month ago

We have an issue, https://github.com/NVIDIA/spark-rapids/issues/9058, to enable Parquet writes in V2 format. We would also like to test the reader, covering combinations of GPU/CPU encoding and decoding of V2 files (and potentially fastparquet, if it supports V2).

V2 Parquet-encoded files should just work with the reader, but we have NOT tested this, so we are not documenting support for it as of today.

mattahrens commented 1 month ago

The scope is to add tests that create V2 Parquet files and exercise both reads and writes.
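
As a rough illustration of the GPU/CPU encode/decode combinations discussed in this issue, the sketch below enumerates a round-trip test matrix. It is only an assumption about how the cases might be organized: the names `round_trip_cases`, `WRITERS`, `READERS`, and `PAGE_VERSIONS` are hypothetical and are not part of the spark-rapids test suite, and V1 is included purely as a baseline. How each writer is actually switched to V2 output is deliberately left out.

```python
from itertools import product

# Hypothetical labels for this sketch; not identifiers from spark-rapids.
WRITERS = ("cpu", "gpu")        # which engine encodes the Parquet file
READERS = ("cpu", "gpu")        # which engine decodes it
PAGE_VERSIONS = ("v1", "v2")    # v1 kept as a baseline next to v2


def round_trip_cases(writers=WRITERS, readers=READERS, versions=PAGE_VERSIONS):
    """Return one (case_id, writer, reader, version) tuple per combination."""
    cases = []
    for writer, reader, version in product(writers, readers, versions):
        case_id = f"write_{writer}-read_{reader}-{version}"
        cases.append((case_id, writer, reader, version))
    return cases


if __name__ == "__main__":
    # Print the test ids so the coverage can be reviewed at a glance.
    for case_id, _writer, _reader, _version in round_trip_cases():
        print(case_id)
```

A list like this could feed a parametrized test in which each case writes a file with the given writer and version, reads it back with the given reader, and compares the result against a CPU-only reference. An additional writer such as fastparquet could be added to `WRITERS` if it turns out to support V2.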