segmentio / parquet-go

Go library to read/write Parquet files
https://pkg.go.dev/github.com/segmentio/parquet-go
Apache License 2.0
341 stars 58 forks source link

panic: runtime error: slice bounds out of range [:1] with capacity 0 #460

Closed mdisibio closed 1 year ago

mdisibio commented 1 year ago

After upgrading library to the latest, getting a panic in reader.Read(). This occurs with both parquet.NewGenericReader and parquet.NewReader. Unfortunately I don't have a sanitized file that reproduces this that I can share.

$ go get "github.com/segmentio/parquet-go"                        
go: upgraded github.com/segmentio/parquet-go v0.0.0-20221209072905-710d87d50dd1 => v0.0.0-20221214174709-7a0ad59e0540
panic: runtime error: slice bounds out of range [:1] with capacity 0

goroutine 29 [running]:
github.com/segmentio/parquet-go.reconstructFuncOfRepeated.func1({0x1a09de0?, 0xc000330c20?, 0xc000330c00?}, {0x5e?, 0x0, 0xe?}, {0xc0003da030, 0x1d, 0x2?})
    /Users/marty/src/tempo/vendor/github.com/segmentio/parquet-go/row.go:641 +0x345
github.com/segmentio/parquet-go.reconstructFuncOfGroup.func1({0x1ab00c0?, 0xc000330c00?, 0xc00051b588?}, {0x97?, 0xf6?, 0x8f?}, {0xc0003da000, 0xc0000a9560?, 0x1f})
    /Users/marty/src/tempo/vendor/github.com/segmentio/parquet-go/row.go:788 +0x575
github.com/segmentio/parquet-go.reconstructFuncOfRepeated.func1({0x1a09da0?, 0xc0000a95d8?, 0xc0000a9560?}, {0x5e?, 0x0, 0xe?}, {0xc00026ef18, 0x1f, 0x11?})
    /Users/marty/src/tempo/vendor/github.com/segmentio/parquet-go/row.go:644 +0x203
github.com/segmentio/parquet-go.reconstructFuncOfGroup.func1({0x1aaff80?, 0xc0000a9560?, 0xc00051b7e0?}, {0x97?, 0xf6?, 0x8f?}, {0xc00026ed80, 0xc0004bc000?, 0x30})
    /Users/marty/src/tempo/vendor/github.com/segmentio/parquet-go/row.go:788 +0x575
github.com/segmentio/parquet-go.reconstructFuncOfRepeated.func1({0x1a09d60?, 0xc0004bc018?, 0xc0004bc000?}, {0x58?, 0x0, 0x51?}, {0xc0000c6b18, 0x30, 0x0?})
    /Users/marty/src/tempo/vendor/github.com/segmentio/parquet-go/row.go:644 +0x203
github.com/segmentio/parquet-go.reconstructFuncOfGroup.func1({0x1b3a140?, 0xc0004bc000?, 0xaa?}, {0x30?, 0xba?, 0x51?}, {0xc0000c6b00, 0x36?, 0x37})
    /Users/marty/src/tempo/vendor/github.com/segmentio/parquet-go/row.go:788 +0x575
github.com/segmentio/parquet-go.(*Schema).Reconstruct(0xc000048f60, {0x19f3fe0?, 0xc0004bc000?}, {0xc00031f000, 0xa6, 0xaa})
    /Users/marty/src/tempo/vendor/github.com/segmentio/parquet-go/schema.go:272 +0x29d
github.com/segmentio/parquet-go.(*Reader).Read(0xc0000a8510, {0x19f3fe0, 0xc0004bc000})
    /Users/marty/src/tempo/vendor/github.com/segmentio/parquet-go/reader.go:212 +0x246
github.com/grafana/tempo/tempodb/encoding/vparquet.findTraceByID({0x1d1f160, 0xc0001ec570}, {0xc000044280, 0x10, 0x20}, 0xc0003241a0, 0xc000164240)
    /Users/marty/src/tempo/tempodb/encoding/vparquet/block_findtracebyid.go:197 +0x908
github.com/grafana/tempo/tempodb/encoding/vparquet.(*backendBlock).FindTraceByID(0xc0002581e0, {0x1d1f0f0, 0xc00012a010}, {0xc000044280, 0x10, 0x20}, {0x400000, 0x0, 0x0, 0x0, ...})
    /Users/marty/src/tempo/tempodb/encoding/vparquet/block_findtracebyid.go:81 +0x4a5

Context We were originally investigating a different but possibly related error on the previous version 20221209072905-710d87d50dd1. When reading the same row we received:

rs → ils → Spans → Links → no values found in parquet row for column 44

but after the upgrade it yields a panic.

kevinburkesegment commented 1 year ago

I'll try to take a look at this one later today

joe-elliott commented 1 year ago

@kevinburkesegment, any progress on this one? Is there any info we can provide to help diagnose?

kevinburkesegment commented 1 year ago

@joe-elliott @mdisibio to be clear - is this an expected error? Did the row actually have zero values in this case?

rs → ils → Spans → Links → no values found in parquet row for column 44

If I am reading this right this is on the path from turning a Go struct into a Parquet file, so maybe the issue is we are building the Parquet file incorrectly?

Like it's easy enough to stop the panic by checking if there are zero values in the row but maybe that reflects an error higher up in processing where we are allowing the creation of a zero-value row.

joe-elliott commented 1 year ago

We received this error when we added the Links field to our schema while iterating old blocks that did not have the field.

The error was intermittent though. Most of the time we could iterate blocks using the ReadRows w/o issue, but sometimes it would fail with this error. Presumably none of the rows had the Links field, but it only seemed to matter occasionally. Marty is currently on PTO and will likely have more detailed info.

We most often saw this while iterating all rows in a parquet file using ReadRows. The stack trace above is a bit different because it is from a "trace by id" lookup. This is the only time we saw it in this context. @mdisibio is on PTO but will have more context when he returns.

kevinburkesegment commented 1 year ago

OK. Achille is also on PTO and would likely be able to solve this within a few minutes

kevinburkesegment commented 1 year ago

some discussion in the meeting - returning an error is also wrong. if the row had no values we'd expect it to be nil.

luistilingue commented 1 year ago

Hi. Any news?

mdisibio commented 1 year ago

I have found a difference in the values for the Links column between the older files that exhibit the error, and newer files that are ok, and also identified a potential fix.

parquet.Row printouts The below snippets were gathered by calling reader.ReadRows() and then printing array of values.

In the older file that exhibits the error, the trace contains 6 spans but there is only a single Links and DroppedLinksCount written for the entire trace. Columns immediately before and after contain 6 values like expected.

row[270] v: 0      r: 0 d: 3 c: 43 rs.ils.Spans.DroppedEventsCount
row[271] v: 0      r: 3 d: 3 c: 43 rs.ils.Spans.DroppedEventsCount
row[272] v: 0      r: 3 d: 3 c: 43 rs.ils.Spans.DroppedEventsCount
row[273] v: 0      r: 3 d: 3 c: 43 rs.ils.Spans.DroppedEventsCount
row[274] v: 0      r: 3 d: 3 c: 43 rs.ils.Spans.DroppedEventsCount
row[275] v: 0      r: 3 d: 3 c: 43 rs.ils.Spans.DroppedEventsCount
row[276] v: <null> r: 0 d: 0 c: 44 rs.ils.Spans.Links                       <--- here
row[277] v: <null> r: 0 d: 0 c: 45 rs.ils.Spans.DroppedLinksCount           <---
row[278] v: <null> r: 0 d: 3 c: 46 rs.ils.Spans.HttpMethod
row[279] v: <null> r: 3 d: 3 c: 46 rs.ils.Spans.HttpMethod
row[280] v: <null> r: 3 d: 3 c: 46 rs.ils.Spans.HttpMethod
row[281] v: <null> r: 3 d: 3 c: 46 rs.ils.Spans.HttpMethod
row[282] v: <null> r: 3 d: 3 c: 46 rs.ils.Spans.HttpMethod
row[283] v: GET    r: 3 d: 4 c: 46 rs.ils.Spans.HttpMethod

In newer written files there is an empty slice for each Span. This example is from 2 empty Spans:

row[66] v: 0      r: 0 d: 3 c: 43 rs.ils.Spans.DroppedEventsCount
row[67] v: 0      r: 3 d: 3 c: 43 rs.ils.Spans.DroppedEventsCount
row[68] v:        r: 0 d: 3 c: 44 rs.ils.Spans.Links
row[69] v:        r: 3 d: 3 c: 44 rs.ils.Spans.Links
row[70] v: 0      r: 0 d: 3 c: 45 rs.ils.Spans.DroppedLinksCount
row[71] v: 0      r: 3 d: 3 c: 45 rs.ils.Spans.DroppedLinksCount
row[72] v: <null> r: 0 d: 3 c: 46 rs.ils.Spans.HttpMethod
row[73] v: <null> r: 3 d: 3 c: 46 rs.ils.Spans.HttpMethod

Theory I believe in reconstructFuncOfRepeated, it does not account for the single null defined at a lower definition level. k starts at 1 and the expectation is that at least 1 value exists for every leaf (the second printout) that is consumed by each reconstruction call.

This doesn't seem to be related to schema conversion, since the Links and DroppedLinksCount columns are present in both files.

Fix Totally unsure how appropriate this is, but amending the logic to substitute a null when value missing appears to work ok.

    for i := 0; i < n; i++ {
          for j, column := range values {
               column = column[:cap(column)]

               if len(column) == 0 {
                    // This means this column was defined null at a higher definition level,
                    // so substitute a single null value here and continue
                    values[j] = append(values[j], Value{})
                    continue
                }

                k := 1
                ...