Open mikemherron opened 3 weeks ago
Updating the repro to use parquet.NewGenericWriter rather than parquet.NewWriter solves the issue of the struct values being changed after the write, but they still have the embedded nils changed to the zero-value on read.
I've created a test for the issue here. I'll look into it a bit and see if I can make some progress towards a fix.
OK, I think the values are correctly written as nil, but turn back to the zero value on read. I traced it down to these lines in schema.go - if I comment out the following code, my test case passes:
 func fieldByIndex(v reflect.Value, index []int) reflect.Value {
     for _, i := range index {
         if v = v.Field(i); v.Kind() == reflect.Ptr || v.Kind() == reflect.Interface {
-            if v.IsNil() {
-                v.Set(reflect.New(v.Type().Elem()))
-                v = v.Elem()
-                break
-            } else {
-                v = v.Elem()
-            }
+            //if v.IsNil() {
+            //    v.Set(reflect.New(v.Type().Elem()))
+            //    v = v.Elem()
+            //    break
+            //} else {
+            //    v = v.Elem()
+            //}
         }
     }
     return v
 }
It looks like this code is intentionally setting the nil pointer to the zero value 🤔 Pointers on the top-level struct don't go down this code path at all, I think due to the if statement here. I don't have any prior knowledge of the code so I'm probably getting to the limit of what I can contribute towards a fix - I'll leave this for now and hopefully someone with more context can pick this up from here :)
Hello @mikemherron
I would try running the test suite with the code change you made. If none of the tests break, then the behavior wasn't tested and it seems fine to change.
If some of the tests break, then it's worth digging a bit to figure out how to retain the behavior while fixing the bug.
I would encourage you to submit a pull request with your code snippet converted to a test so we can at least validate that the issue is being fixed and that we won't introduce regressions in the future.
Thanks a lot for your contribution!
I think there may be an issue when writing structs with embedded pointer values. It seems like on write (and then read) any embedded pointers will have values of nil replaced with the zero-value for the pointer type (I've only tested with int and string but assume this to be the case with other types). Pointers that are defined on the top-level struct don't seem to have this issue. While trying to debug this, I also noticed that the struct passed to writer.Write actually has the embedded pointer values mutated to the zero-value, which I'm assuming is not intended.
Both issues can be seen in the below repro:
This outputs:
It's the "both should be nil" test case that is interesting here - you can see in the original output, both embedded pointers were nil. In the final output, they are 0 and "" (empty string). You can also see after performing the write, the embedded pointer values have changed already.