Open timothy-barry opened 2 years ago
After a bit of searching through the issues on this repo, I noticed that at least one other person seems to be encountering this issue as well: https://github.com/tidyverse/readr/issues/1120#issuecomment-1055255383.
Additional note: this seems to be a more pervasive issue than I had realized. I tried loading a sequence of files via readr::read_delim. R ran out of memory even though (i) each file fits into memory on its own and (ii) I loaded the files one at a time.
# readr: runs out of memory
for (f in fs) {
  print(paste0("Loading ", f))
  x <- readr::read_delim(file = f,
                         delim = " ",
                         skip = 2,
                         col_types = c("iii"))
  rm(x); gc()
}
I repeated this experiment with data.table's fread function; everything works as expected.
# data.table: everything works
for (f in fs) {
  print(paste0("Loading ", f))
  x <- data.table::fread(file = f,
                         sep = " ",
                         colClasses = c("integer", "integer", "integer"),
                         skip = 2)
  rm(x); gc()
}
As far as I can tell, the current version of readr seems to suffer from a more general memory leak, not one limited to the chunked readers, unfortunately.
I am having the same issue. The memory use increases almost monotonically even though the individual chunks are small.
Any updates or workarounds? Can I use edition 1 (via with_edition(1, ...) or local_edition(1)) to resolve this issue, at least for the time being?
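For anyone who wants to try this, here is a minimal sketch of the two ways to select the first-edition parser in readr 2.x. Whether edition 1 actually avoids the memory growth is an assumption that still needs testing; the temporary file and the load_one helper are made up for illustration, and col_names = FALSE is only there because the toy file has no column header.

```r
library(readr)

# Hypothetical stand-in for one of the real files: two lines to skip,
# then three space-delimited integer columns.
f <- tempfile(fileext = ".txt")
writeLines(c("header 1", "header 2", "1 2 3", "4 5 6"), f)

# Option 1: evaluate a single call under edition 1.
x <- with_edition(1, read_delim(f, delim = " ", skip = 2,
                                col_names = FALSE, col_types = "iii"))

# Option 2: switch to edition 1 for the rest of the enclosing function.
# The previous edition is restored when the function exits.
load_one <- function(path) {
  local_edition(1)
  read_delim(path, delim = " ", skip = 2,
             col_names = FALSE, col_types = "iii")
}
y <- load_one(f)
```

Both calls should return the same two-row tibble; the difference is only in how long the edition switch stays in effect.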
Having the same problem here.
To investigate this issue we'll need a reprex, and some indication of how you're measuring R's memory consumption.
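As a starting point, something along these lines could serve as a reprex. It is only a sketch: the generated files are hypothetical stand-ins for the real data, and r_heap_mb is a helper invented here. Note that gc() reports only memory managed by R's own heap, so growth in memory allocated outside R would have to be measured at the process level (for example, resident set size as shown by the operating system).

```r
# Hypothetical test data: a few small space-delimited files of integers,
# each with two leading lines that get skipped.
fs <- replicate(3, tempfile(fileext = ".txt"))
for (f in fs) {
  writeLines(c("skip me", "skip me too",
               paste(1:1000, 1:1000, 1:1000)), f)
}

# Memory currently used by R's heap, in Mb (Ncells + Vcells),
# measured right after a garbage collection.
r_heap_mb <- function() sum(gc()[, 2])

usage <- numeric(length(fs))
for (i in seq_along(fs)) {
  x <- readr::read_delim(fs[i], delim = " ", skip = 2,
                         col_names = FALSE, col_types = "iii")
  rm(x)
  usage[i] <- r_heap_mb()  # record heap use after each file is dropped
}
print(usage)
```

If the recorded values keep climbing across iterations with realistically sized files, that would point to memory being retained on the R side; a flat series alongside a growing process size would point elsewhere.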
I am using the read_delim_chunked function to process large text files chunk by chunk. My expectation is that memory is freed after each chunk has been processed. However, this does not seem to be the case: reading the text file in chunks requires the same amount of memory as reading it all at once. I assume that this is a bug, but maybe my understanding of read_delim_chunked is incorrect. The purpose of reading by chunk is to conserve memory, right? Thanks!
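For reference, this is roughly the usage pattern I have in mind, as a minimal sketch on a made-up file. The file contents and the summarise_chunk callback are hypothetical; the point is that only a small summary is kept per chunk, so in principle the chunk itself should be collectable once the callback returns.

```r
library(readr)

# Hypothetical input: 100 rows of three space-delimited integers.
f <- tempfile(fileext = ".txt")
writeLines(paste(1:100, 101:200, 201:300), f)

# Callback: keep a one-row summary of each chunk, not the chunk itself.
# `pos` is the position of the chunk's first row in the file.
summarise_chunk <- function(chunk, pos) {
  data.frame(start = pos, rows = nrow(chunk), total = sum(chunk[[1]]))
}

# DataFrameCallback row-binds the data frames returned by the callback.
res <- read_delim_chunked(f, DataFrameCallback$new(summarise_chunk),
                          delim = " ", chunk_size = 25,
                          col_names = FALSE, col_types = "iii")
print(res)
```

With chunk_size = 25, the callback runs four times here, and the result has one row per chunk.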