Hi Thilina -
I tried a test with a few years of Pinus strobus data (38K records of Status and Intensity) and it read in fine with the base R read.csv function, so I think the problem is the massive file size/memory?
test <- read.csv("PinusStrobusData.csv")
Strangely, data.table::read.csv worked, although it took time. I also ran data.table::fread; at first it failed due to lack of virtual memory, but it ran in under a minute once I cleaned up memory with gc(). I originally didn't run read.csv since that function is not verbose at all. In the future, I might have to chunk the csv file before reading it.
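A minimal sketch of that chunked approach, using readr::read_csv_chunked (the 100,000-row chunk size is an arbitrary assumption, and in practice you would likely filter or aggregate inside the callback rather than row-bind everything):

```r
library(readr)

# Collect each chunk as-is; DataFrameCallback row-binds whatever the function returns.
cb <- DataFrameCallback$new(function(chunk, pos) chunk)

eastUS <- read_csv_chunked(
  "webPortalData/status_intensity_observation_data.csv",
  callback   = cb,
  chunk_size = 100000
)
```

Note that row-binding every chunk still ends up with the full table in memory; the real saving comes from reducing each chunk inside the callback.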
I downloaded status and intensity data for the 2009-2021 period from the NPN web portal and I am having trouble reading it into R. I first tried:
library(readr)
eastUS <- read_csv(file = "webPortalData/status_intensity_observation_data.csv")
The script runs but gets stuck; R does not become non-responsive, yet it makes no progress even after hours.
I also tried several other reading functions:
library(vroom)
eastUS <- vroom(file = "webPortalData/status_intensity_observation_data.csv")
Still, no success in reading it; the progress bar just gets stuck. vroom is better at reading large datasets than read_csv. These functions work fine for other csv files; it is just the portal-downloaded version that gives me this issue. I am trying to compare whether I get the same data from the web portal download and the rnpn package pull, which is why I need this odd way of getting the status and intensity data. Do you recommend a better package for reading the web portal data into R? I am not sure if this issue is due to the large file size/memory or a problem parsing the csv (some odd field-separation issue).
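One quick way to check the field-separation guess is to look at a few raw lines and at readr's parse problems. A minimal sketch, assuming the same file path as above:

```r
path <- "webPortalData/status_intensity_observation_data.csv"

# Peek at the first few raw lines; with clean separation every line
# should have the same number of fields.
first_lines <- readLines(path, n = 10)
table(count.fields(textConnection(first_lines), sep = ","))

# Read a small slice and ask readr what it struggled with.
small <- readr::read_csv(path, n_max = 1000)
readr::problems(small)
```

data.table::fread(path, verbose = TRUE) also prints its parsing decisions, which can help spot a separator or quoting problem.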