Closed by abraithwaite 1 year ago
Some progress debugging this. I know the root cause is the optional tag, and I can see how the node changes when that tag is enabled, as demonstrated here:
(dlv) p name
"phoneNumbers"
(dlv) call node.GoType().Elem().Kind()
> github.com/segmentio/parquet-go.makeNodeOf() ./schema.go:857 (PC: 0x102f08e5c)
Values returned:
~r0: String (24)
852: panic("unhandled nested slice on parquet schema without list tag")
853: }
854: }
855:
856: if optional {
=> 857: node = Optional(node)
858: }
859:
860: return node
861: }
862:
(dlv) n
> github.com/segmentio/parquet-go.makeNodeOf() ./schema.go:860 (PC: 0x102f08e78)
855:
856: if optional {
857: node = Optional(node)
858: }
859:
=> 860: return node
861: }
862:
863: func forEachTagOption(tags []string, do func(option, args string)) {
864: for _, tag := range tags {
865: _, tag = split(tag) // skip the field name
(dlv) call node.GoType().Elem().Kind()
> github.com/segmentio/parquet-go.makeNodeOf() ./schema.go:860 (PC: 0x102f08e78)
Values returned:
~r0: Slice (23)
855:
856: if optional {
857: node = Optional(node)
858: }
859:
=> 860: return node
861: }
That Kind is ultimately what drives the logic in the Construct functions in makeValue, which is where the panic is being thrown:
https://github.com/segmentio/parquet-go/blob/5bd5f6114638a749b9326aaf4ea5a6ea90cc9cf4/value.go#L203-L206
... skipped code, same function: ...
https://github.com/segmentio/parquet-go/blob/5bd5f6114638a749b9326aaf4ea5a6ea90cc9cf4/value.go#L260-L268
Ultimately, we want v.Type() to be string here, not []string.
I get that Optional() wraps the GoType() in a reflect.PtrTo here, called from schema.go:857 during schema construction.
https://github.com/segmentio/parquet-go/blob/5bd5f6114638a749b9326aaf4ea5a6ea90cc9cf4/node.go#L157
However, removing that reflection wrapper in optional.GoType() doesn't seem to affect the behavior of the code. That is, we still see []string in makeValue when we're expecting just string.
I'm currently looking for where the value being passed into makeValue is being unwrapped and/or set in the first place.
Simple scenario where deconstruct/reconstruct panics unexpectedly. This should be a valid configuration according to the Parquet spec, but our implementation does not yet fully support optional repeated elements.