segmentio / parquet-go

Go library to read/write Parquet files
https://pkg.go.dev/github.com/segmentio/parquet-go
Apache License 2.0

Don't panic if reconstructFuncOfRepeated runs out of values #470

Closed mdisibio closed 1 year ago

mdisibio commented 1 year ago

Fixes #460

We concluded that the first printout shown in https://github.com/segmentio/parquet-go/issues/460#issuecomment-1406931844 (single nil at r:0/d:0) is not valid. Because this is the only reported occurrence so far, gracefully handling and supporting reads of these files doesn't seem necessary. However, returning an error is better than panicking.

This PR returns an error instead. Returning a helpful error directly from this method proved difficult because printing the field's name is non-trivial: node.Fields() contains only the direct child nodes, so indexes no longer match 1-to-1 once there are grandchild (or deeper) fields. We would have to recurse through node.Fields() to find the name.

Instead, we continue and let the empty slice fail further downstream, which produces the same nice error as before: rs → ils → Spans → Links → no values found in parquet row for column 44