In Python:
import pandas as pd
import numpy as np
pd.DataFrame({"a": ["abc", np.nan, "def"]}).to_parquet("somewhere.parquet")
Then in Julia, on the master branch:
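using Parquet

# open the file written by the Python snippet
pf = ParFile("somewhere.parquet")
# the file is very small, so there is only one row group
col_chunks = columns(pf, 1)
colnum = 1
col_chunk = col_chunks[colnum]
# NOTE: tbl is not defined in this snippet; it is assumed to hold the expected table
correct_vals = tbl[colnum]
coltype = eltype(correct_vals)
vals_from_file = values(pf, col_chunk)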
and you will see that vals_from_file[1] contains Int32 values instead of Vector{UInt8}.
The same data can be read in R and Python.