Closed joe-elliott closed 2 years ago
So we recently upgraded our parquet-go dependency: https://github.com/grafana/tempo/pull/1776
The upgrade was from: f812768dfa62 -> 948ea8c65f19
This issue is new as of this upgrade. Also, interestingly, this issue has diminished: https://github.com/segmentio/parquet-go/issues/332, suggesting a possible relationship between the two?
Thanks for reporting, I'll look into this today!
I believe the issue might have been introduced by https://github.com/segmentio/parquet-go/pull/341
Prior to this change, the same condition, c.maxRepetitionLevel > 0, was used to test both whether the repetition levels must be decoded and whether a repeated page must be created.
After the change, the first condition instead tests whether the repetition levels are empty.
In the file you have, there may be empty repetition levels with a max repetition level value greater than zero for the column?
I'm not sure how that's possible. It would seem like an invalid condition for a parquet file, but maybe it happens under some conditions.
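To make the suspected mismatch concrete, here is a minimal sketch of the two checks as described above. It is not the parquet-go code: the column type and the decodeOld/decodeNew helpers are hypothetical names for illustration, and only the two conditions come from the description of the change.

```go
package main

import "fmt"

// column is a hypothetical stand-in for the per-column reader state.
type column struct {
	maxRepetitionLevel int
}

// decodeOld models the behavior before the change: one condition decides
// both whether levels are decoded and whether a repeated page is built.
func decodeOld(c column, _ []byte) (decoded, repeated bool) {
	cond := c.maxRepetitionLevel > 0
	return cond, cond
}

// decodeNew models the behavior after the change: decoding depends on the
// level data being non-empty, while the repeated page still depends on
// maxRepetitionLevel.
func decodeNew(c column, repetitionLevels []byte) (decoded, repeated bool) {
	decoded = len(repetitionLevels) > 0
	repeated = c.maxRepetitionLevel > 0
	return decoded, repeated
}

func main() {
	c := column{maxRepetitionLevel: 1}
	var empty []byte // empty repetition levels on a repeated column

	oldDecoded, oldRepeated := decodeOld(c, empty)
	newDecoded, newRepeated := decodeNew(c, empty)
	fmt.Println("old:", oldDecoded, oldRepeated)
	fmt.Println("new:", newDecoded, newRepeated)
}
```

With empty levels and a max repetition level above zero, the two conditions disagree in the new version: a repeated page would be built from levels that were never decoded, which is one way a nil value could reach later code.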
Are you able to open the file with a program like the standard parquet-tools?
Ok, so by commenting out some code I have determined that this issue and #332 are both caused by *bufio.Reader pooling. This change makes both problems disappear:
func getBufioReader(r io.Reader, bufferSize int) (*bufio.Reader, *sync.Pool) {
	// pool := getBufioReaderPool(bufferSize)
	// rbuf, _ := pool.Get().(*bufio.Reader)
	// if rbuf == nil {
	rbuf := bufio.NewReaderSize(r, bufferSize)
	// } else {
	// 	rbuf.Reset(r)
	// }
	return rbuf, nil
}
A quick review of the code suggests this could be caused by incorrectly using the filePages object after Close() is called (if the timing is just right). Unsure if these issues are a misuse in our code or a bug in the way the bufio.Readers are pooled in parquet-go. Still digging.
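For anyone following along, the use-after-Close hazard can be sketched without parquet-go at all. The staleRead helper below is hypothetical. It stands in for a pool handing the same *bufio.Reader to a second owner by calling Reset directly, since sync.Pool makes no guarantee about which object Get returns.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// staleRead simulates an owner that keeps reading from a *bufio.Reader
// after it has been handed back for reuse (hypothetical helper).
func staleRead() string {
	// First owner obtains a reader over its own data.
	rbuf := bufio.NewReaderSize(strings.NewReader("AAAA"), 16)
	stale := rbuf // reference the first owner keeps after "Close()"

	// The reader is reused: a second owner resets it onto different data.
	// A pool would do this after Put/Get if it returned the same pointer.
	rbuf.Reset(strings.NewReader("BBBB"))

	// The first owner reads through its stale reference and silently
	// receives the second owner's bytes.
	b, _ := io.ReadAll(stale)
	return string(b)
}

func main() {
	fmt.Println(staleRead())
}
```

In a concurrent program the same pattern is a data race rather than a clean swap, so the symptoms could plausibly include corrupted pages or nil pointer panics instead of merely wrong data.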
This issue was caused by a bad return in Tempo. Linked above but I'll include it here as well:
https://github.com/grafana/tempo/pull/1797
If anyone else is seeing similar issues, I would recommend checking the above PR.
Seeing nil pointer panics in decodeDataPage() with some regularity:
Seems to be tripping on this:
Reviewing decodeDataPageV2, there are definitely code paths where both definitionLevels and repetitionLevels are not set, but I don't understand the implications of this or how to handle scenarios where they are not.