redpanda-data / redpanda

Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
https://redpanda.com

parquet: introduce column writer #24020

Closed: rockwotj closed this 1 week ago

rockwotj commented 1 week ago

Note that there are currently no tests for the column writer. This sucks, but no external Go library directly exposes the ability to read/write pages, so it'd be hard to test. There will be test coverage once we write full parquet files; that's kind of the best I can do for now :shrug:

Thankfully, the code there is not too complex, so I feel pretty good about only being able to test it indirectly.

Backports Required

Release Notes

vbotbuildovich commented 1 week ago

the below tests from https://buildkite.com/redpanda/redpanda/builds/57640#0192fe73-e768-4a1a-a396-af22239dfd3e have failed and will be retried

translator_test_rpfixture

the below tests from https://buildkite.com/redpanda/redpanda/builds/57647#0192ff4a-6908-4502-a2c4-b373107c9df2 have failed and will be retried

catalog_schema_manager_rpunit