NVIDIA / spark-rapids

Spark RAPIDS plugin - accelerate Apache Spark with GPUs
https://nvidia.github.io/spark-rapids
Apache License 2.0

[BUG] GPU file writes only test writing a single row group or stripe #11735

Open jlowe opened 2 days ago

jlowe commented 2 days ago

We recently ran into rapidsai/cudf#6763, which triggers when writing booleans with nulls. Our integration tests do cover writing booleans to ORC, but they did not trigger the issue: they write so few rows that each file contains only a single stripe. Had a test written enough rows to produce more than one stripe, the bug would have been caught.
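To illustrate why a small test never reaches a second stripe, here is a minimal sketch in plain Python. It assumes a deliberately simplified model in which a writer starts a new stripe every `rows_per_stripe` rows; real ORC and Parquet writers also split on byte size and other heuristics, so the limit used here is hypothetical and not taken from cudf or the plugin. The helper names (`estimated_stripes`, `rows_for_stripes`, `gen_nullable_bools`) are made up for this example.

```python
import math
import random


def estimated_stripes(num_rows, rows_per_stripe):
    """Stripe count under the simplified model: one stripe per rows_per_stripe rows."""
    if num_rows <= 0:
        return 0
    return math.ceil(num_rows / rows_per_stripe)


def rows_for_stripes(min_stripes, rows_per_stripe):
    """Smallest row count that yields at least min_stripes stripes under the model."""
    return (min_stripes - 1) * rows_per_stripe + 1


def gen_nullable_bools(num_rows, null_fraction=0.1, seed=0):
    """Deterministic boolean column with nulls (None), the shape of data in the report."""
    rng = random.Random(seed)
    return [
        None if rng.random() < null_fraction else rng.random() < 0.5
        for _ in range(num_rows)
    ]


# Hypothetical limit, chosen only to keep the illustration small.
ROWS_PER_STRIPE = 1000

small_test_rows = 200  # a typical "few rows" test stays in one stripe
needed_rows = rows_for_stripes(2, ROWS_PER_STRIPE)

print(estimated_stripes(small_test_rows, ROWS_PER_STRIPE))
print(estimated_stripes(needed_rows, ROWS_PER_STRIPE))

data = gen_nullable_bools(needed_rows)
```

In an actual integration test, the same idea could be applied either by generating enough rows to cross the writer's real stripe or row-group boundary, or by lowering that boundary through the writer's configuration, then checking the number of stripes in the resulting file.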