hangxie closed this issue 1 year ago
v1.6.2 still incorrect.
import (
	"fmt"
	"testing"

	"github.com/xitongsys/parquet-go-source/local"
	"github.com/xitongsys/parquet-go/writer"
)

// AllTypes has two non-pointer string columns, so every row carries a value.
type AllTypes struct {
	F1 string `parquet:"name=f1, type=BYTE_ARRAY, convertedtype=UTF8, encoding=PLAIN_DICTIONARY"`
	F2 string `parquet:"name=f2, type=BYTE_ARRAY, convertedtype=UTF8, encoding=PLAIN_DICTIONARY"`
}

func TestClient_HandleParquet(t *testing.T) {
	fw, err := local.NewLocalFileWriter("all-nil.parquet")
	if err != nil {
		fmt.Println("Can't create local file", err)
		return
	}
	pw, err := writer.NewParquetWriter(fw, new(AllTypes), 4)
	if err != nil {
		fmt.Println("Can't create parquet writer", err)
		return
	}
	// Write 10 rows; both fields are populated in every row.
	for i := 0; i < 10; i++ {
		value := &AllTypes{
			F1: fmt.Sprintf("f%d", i),
			F2: fmt.Sprintf("f%d", i),
		}
		if err = pw.Write(value); err != nil {
			fmt.Println("Write error", err)
		}
	}
	// Flush buffered rows and write the file footer.
	if err = pw.WriteStop(); err != nil {
		fmt.Println("WriteStop error", err)
		return
	}
	fw.Close()
}
@hangxie do something pls
https://github.com/xitongsys/parquet-go/releases/tag/v1.6.2 was released almost 2 years ago; try the head of master.
This version still has this problem.
I don't know what your problem is. This issue is about an incorrect null_count, and since your code inserts no null values, the parquet file I got reports zero null values, which is the right behavior.
Feel free to open a new issue if you believe there is a problem, with minimal sample code and the expected output.
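To make the distinction concrete: a plain `string` field always holds a value, so it can never contribute to null_count; only a nil pointer field can. Below is a minimal sketch using only the standard library. `OptionalRow` and `countNulls` are hypothetical names for illustration, not part of parquet-go, and the `repetitiontype=OPTIONAL` tag is shown on the assumption that this is how a nullable column is declared. It only computes the null count one would expect a writer to record for 10 all-nil rows; it does not write a parquet file.

```go
package main

import "fmt"

// OptionalRow is a hypothetical nullable variant of AllTypes.
// Pointer fields may be nil; the repetitiontype=OPTIONAL tag is an
// assumption about how parquet-go marks a column as nullable.
type OptionalRow struct {
	F1 *string `parquet:"name=f1, type=BYTE_ARRAY, convertedtype=UTF8, repetitiontype=OPTIONAL"`
	F2 *string `parquet:"name=f2, type=BYTE_ARRAY, convertedtype=UTF8, repetitiontype=OPTIONAL"`
}

// countNulls is a hypothetical helper that returns, per column, how many
// rows hold a nil pointer, i.e. the null count expected in the statistics.
func countNulls(rows []OptionalRow) (f1Nulls, f2Nulls int64) {
	for _, r := range rows {
		if r.F1 == nil {
			f1Nulls++
		}
		if r.F2 == nil {
			f2Nulls++
		}
	}
	return f1Nulls, f2Nulls
}

func main() {
	// Ten rows with both fields left nil, mirroring the all-null case.
	rows := make([]OptionalRow, 10)
	f1, f2 := countNulls(rows)
	fmt.Printf("expected null_count: f1=%d f2=%d\n", f1, f2)
}
```

Comparing these expected counts against what parquet-cli or parquet-tools reports for the written file is one way to tell a writer bug apart from a test program that never inserts nulls.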
parquet-go seems to write the wrong null count if all values for a field are null.
Generate all-nil.parquet with this program:
I'm expecting 10 null values, but both parquet-cli (https://github.com/apache/parquet-mr/blob/master/parquet-cli/README.md) and my parquet-tools (https://github.com/hangxie/parquet-tools) return 6. I tested a couple of other cases: