Open rkrug opened 7 months ago
Thanks @rkrug, I was able to reproduce the error with the data you provided. I'll have a look soon and report back.
Any updates on this?
I am having a possibly related problem when trying to save a data.table to a Parquet file:
library(tidyverse)
library(data.table)
library(arrow) # version 17.0.0.1
options(arrow.use_dt = TRUE)
# 1m sample
d[1:(10^6)] %>% write_dataset('C:/test1')
t1 <- open_dataset('C:/test1') %>% collect()
class(t1)
"data.table" "data.frame"
# 10m sample
d[1:(10*10^6)] %>% write_dataset('C:/test2')
t2 <- open_dataset('C:/test2') %>% collect()
class(t2)
"data.table" "data.frame"
# 20m sample
d[1:(20*10^6)] %>% write_dataset('C:/test3')
t3 <- open_dataset('C:/test3') %>% collect()
class(t3)
"data.table" "data.frame"
# Full data (270m obs)
d %>% write_dataset('C:/test4')
Warnings
1: Invalid metadata$r
2: Invalid metadata$r
3: Invalid metadata$r
t4 <- open_dataset('C:/test4') %>% collect()
class(t4)
"tbl_df" "tbl" "data.frame"
The weird thing is that when I feed in smaller samples of the data, the Parquet file is saved without warnings and the open_dataset() > collect() operation returns a data.table, as expected.
However, when I feed in the full dataset (270m obs), there are three "Invalid metadata$r" warnings and open_dataset() > collect() returns a "tbl_df" "tbl" "data.frame" object.
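Until the cause is found, one possible stopgap is to convert the collected result back to a data.table when it comes back with a different class. This is only a sketch and does not address the warning itself: `ensure_data_table()` and the `t4_like` stand-in object are hypothetical names, not part of arrow or data.table.

```r
library(data.table)

# Hypothetical helper: make sure a collected result is a data.table.
ensure_data_table <- function(x) {
  if (!is.data.table(x)) {
    setDT(x)  # converts in place, avoiding a copy of a large table
  }
  x
}

# Small stand-in for a collect() result that came back tibble-classed.
t4_like <- data.frame(id = 1:3, x = c(0.1, 0.2, 0.3))
class(t4_like) <- c("tbl_df", "tbl", "data.frame")

t4_fixed <- ensure_data_table(t4_like)
class(t4_fixed)
```

setDT() is used rather than as.data.table() because it works by reference, which matters at 270m rows. Any attributes that were lost along with the R metadata would not be restored by this.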
Describe the bug, including details regarding any error messages, version, and platform.
Hi, I have a Parquet file (https://www.dropbox.com/scl/fi/lsg2xxe565dfa88e9plo4/part-0.parquet?rlkey=3w2sjc6xewaz9lxd4cwcvf65b&dl=0) which is causing an
Invalid metadata$r
warning. It seems to be working fine, but the warning is annoying. The file is written from R as part of a partitioning database, and the warning occurs with other files as well. Please find the code and the link to the file at the end.
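If the goal is only to silence this one warning while letting all others through, a base R calling handler can do it. A minimal sketch, assuming the warning text contains the literal string "Invalid metadata$r" as reported above; `muffle_invalid_r_metadata()` is a hypothetical helper name, not an arrow function.

```r
# Hypothetical helper: evaluate `expr`, muffling only warnings whose
# message contains "Invalid metadata$r". Other warnings propagate as usual.
muffle_invalid_r_metadata <- function(expr) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (grepl("Invalid metadata$r", conditionMessage(w), fixed = TRUE)) {
        invokeRestart("muffleWarning")
      }
    }
  )
}

# Possible usage with the file from this report (path is an assumption):
# df <- muffle_invalid_r_metadata(arrow::read_parquet("part-0.parquet"))
```

This hides the symptom only; whatever R attributes the metadata carried would still be missing from the result.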
The Parquet file can be downloaded from: https://www.dropbox.com/scl/fi/lsg2xxe565dfa88e9plo4/part-0.parquet?rlkey=3w2sjc6xewaz9lxd4cwcvf65b&dl=0
Component(s)
R