traversc / qs

Quick serialization of R objects

Extra memory usage when loading an object twice #85

wlandau closed this issue 1 year ago

wlandau commented 1 year ago

When I load the same object with qread() twice, htop seems to report extra memory usage.

packageVersion("qs")
#> [1] ‘0.25.5’

data <- data.frame(col1 = c("a", "b", "c"), col2 = c(1, 2, 3)) |>
  list() |>
  rep(x=_, 3e6)
tmp <- tempfile()
qs::qsave(data, tmp)

# RES 108M

loaded <- qs::qread(tmp) # 1.73 GB
gc()

# RES 1790M

loaded <- qs::qread(tmp)
gc()

# RES 3499M. Should this be around 1790M?

traversc commented 1 year ago

Try this out with plain R and compare. When I ran this, my htop started at 300M and ended at 3.55G.

data <- data.frame(col1 = c("a", "b", "c"), col2 = c(1, 2, 3)) |>
  list() |>
  rep(x=_, 3e6)
tmp <- tempfile()
saveRDS(data, tmp)

loaded <- readRDS(tmp)
gc()

loaded <- readRDS(tmp)
gc()

wlandau commented 1 year ago

Yup, I see a similar result on my machine. That's odd; I thought I had explained this in https://github.com/ropensci/targets/discussions/1116.