segmentio / parquet-go

Go library to read/write Parquet files
https://pkg.go.dev/github.com/segmentio/parquet-go
Apache License 2.0
341 stars, 58 forks

Opening file with no rows causes panic #405

Closed: bprosnitz closed this issue 1 year ago

bprosnitz commented 1 year ago

When opening a file with no rows, I see the following panic:

panic: runtime error: index out of range [0] with length 0

goroutine 9759 [running]:
github.com/segmentio/parquet-go.(*File).ReadPageIndex(0x4ce8540?)
        file.go:177 +0x705
github.com/segmentio/parquet-go.OpenFile({0x4cc8d40?, 0xc0050541e0}, 0x10c4, {0x0, 0x0, 0x0})
        file.go:84 +0x4dc
github.com/segmentio/parquet-go.openFile({0x4cc8d40?, 0xc0050541e0?})
        reader.go:99 +0x7a
github.com/segmentio/parquet-go.NewGenericReader[...]({0x4cc8d40, 0xc0050541e0}, {0x0, 0xc001311bc0?, 0x4e416d?})
        reader_go18.go:35 +0x85

The line at the top of the trace (file.go:177) contains the following: columnIndexOffset := f.metadata.RowGroups[0].Columns[0].ColumnIndexOffset
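
For illustration, here is a minimal sketch of the kind of guard that would avoid the out-of-range index. It does not use parquet-go and is not the library's actual fix: the types `fileMetaData`, `rowGroup`, and `columnChunk` and the helper `firstColumnIndexOffset` are hypothetical stand-ins that model only the fields named on that line.

```go
package main

import "fmt"

// Hypothetical stand-ins for the file metadata; only the fields that
// appear in the panicking expression are modeled here.
type columnChunk struct {
	ColumnIndexOffset int64
}

type rowGroup struct {
	Columns []columnChunk
}

type fileMetaData struct {
	RowGroups []rowGroup
}

// firstColumnIndexOffset returns the column index offset of the first
// column of the first row group. It reports false instead of panicking
// when there are no row groups or the first row group has no columns.
func firstColumnIndexOffset(m *fileMetaData) (int64, bool) {
	if len(m.RowGroups) == 0 || len(m.RowGroups[0].Columns) == 0 {
		return 0, false
	}
	return m.RowGroups[0].Columns[0].ColumnIndexOffset, true
}

func main() {
	// A file with no rows is assumed here to carry no row groups.
	empty := &fileMetaData{}
	if _, ok := firstColumnIndexOffset(empty); !ok {
		fmt.Println("no row groups: skipping page index")
	}

	// A file with one row group and one column.
	full := &fileMetaData{
		RowGroups: []rowGroup{{Columns: []columnChunk{{ColumnIndexOffset: 4}}}},
	}
	offset, ok := firstColumnIndexOffset(full)
	fmt.Println(offset, ok)
}
```

Checking both slice lengths before indexing covers either source of the "index out of range [0] with length 0" message, since the trace alone does not say which slice was empty.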

achille-roussel commented 1 year ago

Hello @bprosnitz and thanks for reporting!

I'm curious to hear why you closed the issue. Were you able to find a workaround?

bprosnitz commented 1 year ago

Hi!

Sorry, I shouldn't have closed the issue -- I didn't have a simple test example at the time and wanted to confirm.

I created a PR that adds a test case file reproducing this behavior, along with a small fix -- see https://github.com/segmentio/parquet-go/pull/408