Open cpainsight opened 2 years ago
I can't replicate your exact message, but I did find a bug and have fixed it. I have pushed the fix to CRAN, so it should be available soon.
I would encourage you to try {arrow} instead of {disk.frame}. You can convert a disk.frame to Parquet files for use with arrow using the following function:
disk.frame::disk.frame_to_parquet()
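A minimal sketch of what that conversion might look like. The helper name `convert_and_open` is hypothetical, and the argument order `(df, outdir)` for `disk.frame_to_parquet()` is an assumption; check `?disk.frame_to_parquet` for your installed version before relying on it.

```r
# Hypothetical helper, not part of disk.frame or arrow.
# Converts an existing disk.frame folder to Parquet files and opens
# the result as an arrow Dataset.
convert_and_open <- function(df_path, parquet_dir) {
  stopifnot(
    requireNamespace("disk.frame", quietly = TRUE),
    requireNamespace("arrow", quietly = TRUE)
  )
  df <- disk.frame::disk.frame(df_path)
  # Assumed argument order: (df, outdir)
  disk.frame::disk.frame_to_parquet(df, parquet_dir)
  # open_dataset() reads a directory of Parquet files lazily
  arrow::open_dataset(parquet_dir)
}

# Usage (paths are placeholders):
# claims <- convert_and_open("path/to/Claims.df", "path/to/Claims_parquet")
```

The returned Dataset can then be queried with dplyr verbs and `collect()`, much like the disk.frame pipeline.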
Still getting the same error message, but now it adds a warning message:
Error in tools::file_path_as_absolute(attr(df, "path")) :
file 'C:/Users/Ricardo Torres/OneDrive/FreeAgent Drive Back-up/My Documents/IHA/Radiation Therapy & Cancer Institute/2021/Cancer Centers/mmm/Claims/Claims.df/' does not exist
In addition: Warning message:
In collect.summarized_disk.frame(.) :
These columns that appear in the group-by and summarise does not appear in the original data set: sum, y. This set of action is too hard for disk.frame to figure out the srckeep automatically, you must do the srckeep manually.
What does dir.exists("C:/Users/Ricardo Torres/OneDrive/FreeAgent Drive Back-up/My Documents/IHA/Radiation Therapy & Cancer Institute/2021/Cancer Centers/mmm/Claims/Claims.df/") say?
Are you able to move it off OneDrive and try again? I wonder if there is some issue with that.
dir.exists("C:/Users/Ricardo Torres/OneDrive/FreeAgent Drive Back-up/My Documents/IHA/Radiation Therapy & Cancer Institute/2021/Cancer Centers/mmm/Claims/Claims.df/")
[1] TRUE
I suspect this is either a base R issue or an issue with OneDrive.
Have you tried moving the data off OneDrive and running it there?
I just tested moving it off OneDrive and it worked. You are right, the issue is with OneDrive, though I wonder what exactly it is. All other projects work fine with files located on OneDrive, including this one until two or three weeks ago.
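For anyone wanting to try the same workaround, here is a rough sketch of copying a disk.frame folder to a location outside OneDrive using only base R. The helper name `copy_dir_local` and the demo paths are hypothetical; a disk.frame is a folder of chunk files, so a recursive copy should be enough, though that is an assumption worth verifying on your own data.

```r
# Hypothetical helper: copy a folder (e.g. a disk.frame directory) into
# another parent directory and return the path of the copy.
copy_dir_local <- function(src_dir, dest_parent) {
  src_dir <- sub("[/\\\\]+$", "", src_dir)  # drop any trailing slash
  dir.create(dest_parent, recursive = TRUE, showWarnings = FALSE)
  ok <- file.copy(src_dir, dest_parent, recursive = TRUE)
  if (!all(ok)) stop("Copy failed for: ", src_dir)
  file.path(dest_parent, basename(src_dir))
}

# Demo with temporary folders standing in for the real paths
src <- file.path(tempdir(), "demo_src", "Claims.df")
dir.create(src, recursive = TRUE, showWarnings = FALSE)
writeLines("placeholder chunk", file.path(src, "1.fst"))

local_path <- copy_dir_local(paste0(src, "/"), file.path(tempdir(), "demo_local"))
print(list.files(local_path))

# With the real data, the copy could then be attached as usual:
# utilization <- disk.frame::disk.frame(local_path)
```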
tools::file_path_as_absolute(attr(df
Maybe the above function is doing something wrong as well, so it could be a base R problem.
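One way to narrow this down is to compare how the different base R checks see the same path. The sketch below is only a diagnostic, and `diagnose_path` is a hypothetical helper. It rests on two points: `?file.exists` notes that on Windows directory names must not include a trailing slash or backslash, and `tools::file_path_as_absolute()` appears to rely on `file.exists()` (my reading of its source, so treat that as an assumption). If so, a path ending in "/" could pass `dir.exists()` yet still trigger the "does not exist" error.

```r
# Hypothetical helper: report how several base R checks see one path.
diagnose_path <- function(path) {
  trimmed <- sub("[/\\\\]+$", "", path)  # same path without trailing slashes
  list(
    dir_exists          = dir.exists(path),
    file_exists         = file.exists(path),
    file_exists_trimmed = file.exists(trimmed),
    # Capture the error message instead of stopping, so all checks run
    absolute            = tryCatch(
      tools::file_path_as_absolute(path),
      error = function(e) conditionMessage(e)
    )
  )
}

# Demo on a temporary directory, written with a trailing slash
tmp <- file.path(tempdir(), "diag_demo.df")
dir.create(tmp, showWarnings = FALSE)
res <- diagnose_path(paste0(tmp, "/"))
str(res)
```

If `dir_exists` is TRUE while `file_exists` is FALSE for the real path, dropping the trailing slash in the call to `disk.frame()` would be worth trying before blaming OneDrive.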
Hi:
I'm new to R and disk.frame. I began using the package 6 months ago to process a 10 GB CSV file. It has worked perfectly, but about a week ago the error message shown below started to appear and I'm no longer able to run my code. Nothing in my code has changed, and I had been running it for a while without any issues. Here is part of my code:
Setting up the CPU for parallel processing
setup_disk.frame()
options(future.globals.maxSize = Inf)
Conversion of CSV file into disk frame to allow for parallel processing
csv_to_disk.frame(infile = "C:/Users/Ricardo Torres/OneDrive/FreeAgent Drive Back-up/My Documents/IHA/Radiation Therapy & Cancer Institute/2021/Cancer Centers/mmm/Claims/Claims2.csv", outdir = "C:/Users/Ricardo Torres/OneDrive/FreeAgent Drive Back-up/My Documents/IHA/Radiation Therapy & Cancer Institute/2021/Cancer Centers/mmm/Claims/Claims.df")
Loading the Disk Frame into a data object
utilization <- disk.frame("C:/Users/Ricardo Torres/OneDrive/FreeAgent Drive Back-up/My Documents/IHA/Radiation Therapy & Cancer Institute/2021/Cancer Centers/mmm/Claims/Claims.df/")
util_byyear <- utilization %>%
  srckeep(c('ToDate', 'Billed', 'Allowed', 'Deduct', 'Copay', 'Paid')) %>%
  mutate(y = lubridate::year(lubridate::mdy(ToDate))) %>%
  group_by(y) %>%
  summarize(
    billed = sum(Billed),
    allowed = sum(Allowed),
    deductibles = sum(Deduct),
    copays = sum(Copay),
    paid = sum(Paid)
  ) %>%
  collect()
When I run this last part of the code, I'm getting the following error:
Error in tools::file_path_as_absolute(attr(df, "path")) :
file 'C:/Users/Ricardo Torres/OneDrive/FreeAgent Drive Back-up/My Documents/IHA/Radiation Therapy & Cancer Institute/2021/Cancer Centers/mmm/Claims/Claims.df/' does not exist