vedantroy closed this issue 1 year ago
I have the same issue, but after 26215 rows.
Looks like issue #471 is similar?
I create a (large) file using something like:
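fh, err := os.Create(fname)
...
writer := parquet.NewGenericWriter[OutType](fh)
nrows := 1000000
for i := 0; i < nrows; i++ {
	...
	// Write returns (rows written, error); check the error instead of dropping it.
	if _, err := writer.Write(toWrite); err != nil {
		t.Fatal(err)
	}
}
// Close flushes buffered rows and writes the file footer.
if err := writer.Close(); err != nil {
	t.Fatal(err)
}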
Then, later, I read it back using:
rows, err := parquet.ReadFile[OutType](fname)
...
assert.Equal(t, nrows, len(rows)) // testify takes (expected, actual)
len(rows) comes out to 26215 on my MacBook Pro, whereas nrows is (obviously) 1000000.
I have the same issue. It should read 9'000'000+ rows, but I only get 131'000.
Is the read stopping silently when it runs out of memory?
This zipped Parquet file, debug.zip, has more than 1024 rows (around 9K).
But if I read it using the following code:
and print out the number of rows, I only see 1024 rows. Tools like Python's pandas library don't have this issue.
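One pattern that would fit all three reports (26215, 131'000, and 1024 rows) is a single Read call returning fewer rows than the buffer can hold, the way an io.Reader may return a short read. That is only an assumption about the cause, not something confirmed in this thread. The sketch below uses the standard library only: `rowReader`, `chunkedReader`, and `readAll` are hypothetical names made up for illustration, and `chunkedReader` is a fake source that hands out at most 1024 rows per call. It shows how one call comes up short while a loop that runs until io.EOF collects everything.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// rowReader is a hypothetical stand-in for any reader whose Read may
// return fewer rows than the buffer can hold.
type rowReader[T any] interface {
	Read(rows []T) (int, error)
}

// chunkedReader is a fake source for illustration: it serves the ints
// 0..total-1 but never more than chunk rows per call.
type chunkedReader struct {
	next, total, chunk int
}

func (r *chunkedReader) Read(rows []int) (int, error) {
	if r.next >= r.total {
		return 0, io.EOF
	}
	// Serve the smallest of: buffer size, chunk size, rows remaining.
	n := len(rows)
	if n > r.chunk {
		n = r.chunk
	}
	if rem := r.total - r.next; n > rem {
		n = rem
	}
	for i := 0; i < n; i++ {
		rows[i] = r.next + i
	}
	r.next += n
	return n, nil
}

// readAll keeps calling Read until io.EOF and collects every row.
func readAll[T any](r rowReader[T], bufSize int) ([]T, error) {
	var out []T
	buf := make([]T, bufSize)
	for {
		n, err := r.Read(buf)
		out = append(out, buf[:n]...) // keep rows even when err is set
		if errors.Is(err, io.EOF) {
			return out, nil
		}
		if err != nil {
			return out, err
		}
		if n == 0 {
			// No rows and no error: stop rather than spin forever.
			return out, io.ErrNoProgress
		}
	}
}

func main() {
	// A single Read into a buffer sized for the whole file still stops early.
	n, _ := (&chunkedReader{total: 9000, chunk: 1024}).Read(make([]int, 9000))
	fmt.Println("single Read:", n) // single Read: 1024

	// Looping until io.EOF returns every row.
	all, err := readAll[int](&chunkedReader{total: 9000, chunk: 1024}, 4096)
	if err != nil {
		panic(err)
	}
	fmt.Println("read loop:", len(all)) // read loop: 9000
}
```

If the real reader behaves this way, comparing the count returned by one Read call against the row count in the file metadata should show the gap directly.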