r-lib / fastmap

Fast map implementation for R
https://r-lib.github.io/fastmap/

Q: Can "the R symbol table [...] is never garbage-collected" be fixed? #7

Closed HenrikBengtsson closed 5 years ago

HenrikBengtsson commented 5 years ago

"... that key is interned as a symbol and stored in the R symbol table, which is never garbage-collected."

Interesting - I wasn't aware of this. Just curious, do you think this could be fixed in R itself, or does that require a major, complicated rewrite? Please let me know if this has already been discussed elsewhere, e.g. R-devel.

wch commented 5 years ago

I had an in-person discussion with Luke Tierney about this. If I recall correctly, he said it would take a few solid weeks of work but that he wouldn't have time in the near future to do it.

Here's an example that directly demonstrates the leakage:

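library(pryr)  # provides mem_used()
mem_used()     # baseline memory use before any new symbols are created

for (i in 1:8) {
  for (j in 1:1e4) {
    # Intern a new, effectively unique symbol; the result is discarded
    as.symbol(as.character(rnorm(1)))
  }
  print(mem_used())  # memory keeps growing although nothing is retained
}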

(Note that mem_used() calls gc().)

The output:

40.4 MB
42.7 MB
44.7 MB
46.7 MB
48.7 MB
50.7 MB
52.7 MB
54.6 MB
56.6 MB
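The same effect can be probed without pryr, and through an environment used as a map rather than through as.symbol() directly. The following is only a minimal sketch under stated assumptions: mem_mb() and leak_demo() are hypothetical helper names introduced here, and memory is read from the "(Mb)" column of base R's gc(). It adds random keys to an environment, removes them again, and records memory use after each round. Exact numbers depend on the R version and platform, so no output is shown.

```r
# Sketch only: mem_mb() and leak_demo() are illustrative helpers, not part
# of fastmap or pryr.

# Memory currently in use (Ncells + Vcells), in MB, measured after a gc().
mem_mb <- function() {
  sum(gc()[, 2])
}

# Add n_keys random keys to an environment, remove them all, and repeat.
# Returns memory use before the first round and after each round, plus the
# number of keys left in the environment at the end.
leak_demo <- function(n_keys = 1e4, rounds = 3) {
  env <- new.env(hash = TRUE, parent = emptyenv())
  used <- numeric(rounds + 1)
  used[1] <- mem_mb()
  for (r in seq_len(rounds)) {
    keys <- unique(as.character(rnorm(n_keys)))
    for (k in keys) assign(k, TRUE, envir = env)  # each key becomes a symbol
    rm(list = keys, envir = env)                  # the map is empty again
    rm(keys)
    used[r + 1] <- mem_mb()
  }
  list(mb = used, remaining = length(ls(env, all.names = TRUE)))
}

res <- leak_demo()
print(res$mb)         # memory use per round, in MB
print(res$remaining)  # number of keys still stored in the environment
```

If memory use rises from round to round while `remaining` stays at 0, the growth cannot come from the stored values and is consistent with the interned key symbols staying in the symbol table.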

HenrikBengtsson commented 5 years ago

Thanks for the info and good to hear that you talked to Luke about it. I hope it will be fixed one day, but I also understand that these things can take quite a while to get fixed.

dselivanov commented 5 years ago

Apart from leaking memory, environments don't have constant insertion time: insertion time increases with the size of the environment, which should not be the case for a well-designed hash map.
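One way to check the insertion-time claim on a given R installation is to time equal-sized batches of inserts into a single, growing environment. This is a rough sketch, not a rigorous benchmark: time_inserts() is a hypothetical helper introduced here, the batch sizes are arbitrary, and system.time() is coarse for short runs. Note that the keys it creates are themselves interned as symbols.

```r
# Sketch only: time_inserts() is an illustrative helper, not a fastmap API.
# It inserts `batch` new keys per round into one environment and records the
# elapsed time of each round, so later rounds run against a larger map.
time_inserts <- function(batches = 5, batch = 2e4) {
  env <- new.env(hash = TRUE, parent = emptyenv())
  elapsed <- numeric(batches)
  for (b in seq_len(batches)) {
    # Keys are built outside the timed region so only assign() is measured
    keys <- paste0("key_", b, "_", seq_len(batch))
    elapsed[b] <- system.time(
      for (k in keys) assign(k, 1L, envir = env)
    )[["elapsed"]]
  }
  data.frame(size_after = seq_len(batches) * batch, seconds = elapsed)
}

timings <- time_inserts()
print(timings)
```

With constant-time insertion, the `seconds` column should stay roughly flat as `size_after` grows. A steady increase would support the observation above.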