segmentio / parquet-go

Go library to read/write Parquet files
https://pkg.go.dev/github.com/segmentio/parquet-go
Apache License 2.0

Fix #501 #503

Closed joe-elliott closed 1 year ago

joe-elliott commented 1 year ago

This PR aims to fix #501. I'm not 100% confident in this fix, but based on my understanding of the code in question, it looks valid.

The issue appears to be this line here:

https://github.com/segmentio/parquet-go/blob/main/column_buffer.go#L851

The code builds a bitmask in b that holds the boolean values to set in the current byte, then ORs that bitmask into the byte with |=. Interestingly, because |= can only set bits, this path can never drop an existing bit flag to 0, so whatever happened to be left over in the buffer persists.

Currently running the Tempo test suite against this.

joe-elliott commented 1 year ago

The Tempo test suite has passed successfully. That doesn't mean this is correct, but it does exercise the parquet-go library in a lot of different ways.