scala / bug

Scala 2 bug reports only. Please, no questions — proper bug reports only.
https://scala-lang.org

Garbage collecting REPL bindings... #4331

Open scabug opened 13 years ago

scabug commented 13 years ago

= problem =

A lot of values that I compute are huge, and when done in the REPL, a binding similar to:

val res3 =

is introduced by Scala. After a few of these bindings I run out of memory, and have to start the REPL afresh.

Workarounds like always binding a var at the REPL and clearing it, or wrapping all REPL values with SoftReference are too annoying.
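For readers unfamiliar with the two workarounds mentioned above, here is a minimal sketch. All names (`ReplWorkarounds`, `big`, `release`, `soft`) are hypothetical and chosen only for illustration; in a real session the same statements would be typed at the prompt rather than wrapped in an object.

```scala
import java.lang.ref.SoftReference

// Hypothetical illustration of the two manual workarounds; not part of the REPL.
object ReplWorkarounds {
  // Workaround 1: bind the large value to a var and null it out when done,
  // so no strong reference to the array remains.
  var big: Array[Byte] = new Array[Byte](1024)
  def release(): Unit = { big = null }

  // Workaround 2: hold the value only through a SoftReference, which the JVM
  // is allowed to clear under memory pressure instead of throwing OutOfMemoryError.
  def soft[A <: AnyRef](value: A): SoftReference[A] = new SoftReference[A](value)
}
```

Both approaches rely on the user remembering to apply them to every large expression, which is why the report calls them too annoying.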

= analysis =

= enhancement recommendation =

Always bind top level values to some fixed identifier like "it", etc.

scabug commented 13 years ago

Imported From: https://issues.scala-lang.org/browse/SI-4331?orig=1
Reporter: apollo
See #7787

scabug commented 13 years ago

@paulp said: I thought I'd implemented this, but when it didn't work I found it's harder than I thought, because MODULE$ acts as a GC root on each line. I have to change a few more things than I expected, so it'll take a little while.

scabug commented 12 years ago

@retronym said: Just checked this after a REPL :reset, same story.

byte[33554432] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...}
 a of $line6.$read$$iw$$iw$
  MODULE$ of $line6.$read$$iw$$iw$
   [15] of java.lang.Object[40]
    elementData of java.util.Vector
     classes of scala.tools.nsc.interpreter.IMain$TranslatingClassLoader
      classLoader of scala.reflect.runtime.Mirror

scabug commented 12 years ago

@paulp said: Yeah, nothing short of nulling them out will get them back as things presently stand - they are statically reachable.
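To illustrate "statically reachable": a Scala `object` compiles to a class whose single instance is stored in a static `MODULE$` field, which matches the `MODULE$` entry in the heap path above. The sketch below assumes the REPL's per-line wrapper behaves roughly like an ordinary object; `Line6` is a hypothetical stand-in for `$line6.$read$$iw$$iw$`, and the array is kept small here.

```scala
// Hypothetical stand-in for a REPL line wrapper; the real wrapper is generated code.
object Line6 {
  // Once the object is initialized, this field is reachable via Line6's static
  // MODULE$ field for as long as the class (and its classloader) stays loaded.
  val a: Array[Byte] = new Array[Byte](1024)
}
```

Because `a` is a `val` on such an object, ordinary code cannot drop the reference; only discarding the whole classloader or nulling the field reflectively would release it.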

scabug commented 8 years ago

@SethTisue said: Nulling them out at :reset time using reflection is certainly ugly, but probably better than doing nothing? Is there a downside? Wondering if a PR with such a change would be accepted.

scabug commented 8 years ago

@retronym said (edited on Nov 11, 2015 4:36:38 AM UTC): Turns out supporting GC of bindings in the :reset command is much easier: we just need to make sure the classloader we're discarding is not retained as the context classloader of any thread, or in a field of a runtime reflection mirror that we're storing in the REPL.

https://github.com/scala/scala/pull/4841

Fixing that perhaps doesn't go far enough to provide a useful answer for the enhancement request, however. From my commit comment:

    While this commit makes :reset perform its advertised function,
    it doesn't go further toward making it a particularly useful command.
    Ideally we'd offer a mechanism to transport some data across a :reset
    boundary. The user could do something like this manually by stashing
    data in a `java.util.Map` hosted in some class in the same classloader
    as IMain. They would also have to restrict this to sending "pure data"
    rather than instances of data types defined in the REPL session
    itself.
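
The manual stash described in the commit comment might look something like the sketch below. `ResetStash` is a hypothetical name, and the sketch assumes the class is compiled onto the REPL's own classpath (so it is loaded by the same classloader as IMain and survives a :reset) rather than defined inside the session.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical holder; assumed to live on the REPL's classpath, not in the session.
object ResetStash {
  private val data = new ConcurrentHashMap[String, AnyRef]()

  // Store "pure data" only (strings, boxed primitives, arrays, JDK collections);
  // instances of classes defined in the session would pin the old classloader.
  def put(key: String, value: AnyRef): Unit = { data.put(key, value); () }

  // Remove and return the value so the stash itself does not become a leak.
  def take(key: String): Option[AnyRef] = Option(data.remove(key))
}
```

Using `take` rather than a plain lookup means the stash releases its reference as soon as the value has been carried across the reset.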

scabug commented 8 years ago

@som-snytt said: I was going to comment that the classloader is discarded on reset.

The PR for ScriptEngine uses Dynamic selection in the template; maybe there's a convenient syntax for doing that from a REPL session.

Alternatively, more general switching of templates would allow the kinds of wrappings described (vars or softrefs or other holders, etc). There is or was a mechanism for wrapping every computation, IIRC.

runzhiwang commented 4 years ago

@SethTisue Hi, is there any solution for this issue? The memory leak is a big problem in our situation.

SethTisue commented 4 years ago

I don't know of any solution or workaround.

I wonder, though, if anyone has investigated whether the situation is any different under -Yrepl-class-based.

som-snytt commented 4 years ago

Retronym's reset workaround saga is at https://github.com/scala/scala/pull/5657

$ ~/scala-2.12.3/bin/scala -J-XX:+HeapDumpOnOutOfMemoryError -J-Xmx512M
Welcome to Scala 2.12.3 (OpenJDK 64-Bit Server VM, Java 11.0.3).
Type in expressions for evaluation. Or try :help.

scala> 42
res0: Int = 42

scala> {val b = new Array[Byte](256 * 1024* 1024); () => b}
res1: () => Array[Byte] = $$Lambda$1082/0x00000001006fe440@78116659

scala> {val b = new Array[Byte](256 * 1024* 1024); () => b}
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid30320.hprof ...
Heap dump file created [310693587 bytes in 0.558 secs]
java.lang.OutOfMemoryError: Java heap space
  ... 30 elided

scala> :reset
Resetting interpreter state.
Forgetting this session history:

42
{val b = new Array[Byte](256 * 1024* 1024); () => b}

Forgetting all expression results and named terms: $intp

scala> {val b = new Array[Byte](256 * 1024* 1024); () => b}
java.lang.OutOfMemoryError: Java heap space

But since 2.12.4:

$ skala -J-XX:+HeapDumpOnOutOfMemoryError -J-Xmx512M
Welcome to Scala 2.12.10-20190905-182221-e67ab6d (OpenJDK 64-Bit Server VM, Java 11.0.3).
Type in expressions for evaluation. Or try :help.

scala> 42
res0: Int = 42

scala> {val b = new Array[Byte](256 * 1024* 1024); () => b}
res1: () => Array[Byte] = $$Lambda$1882/0x000000010088f840@4351ed61

scala> {val b = new Array[Byte](256 * 1024* 1024); () => b}
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid30463.hprof ...
Heap dump file created [304537356 bytes in 0.493 secs]
java.lang.OutOfMemoryError: Java heap space
  ... 29 elided

scala> :reset
Resetting interpreter state.
Forgetting this session history:

42
{val b = new Array[Byte](256 * 1024* 1024); () => b}

Forgetting all expression results and named terms: $intp

scala> {val b = new Array[Byte](256 * 1024* 1024); () => b}
res0: () => Array[Byte] = $$Lambda$1921/0x000000010088f840@14590fe2

The class loader fix was forward-ported, but it looks like the REPL refactor is broken (with or without -Yrepl-class-based):

$ scala -J-XX:+HeapDumpOnOutOfMemoryError -J-Xmx512M
Welcome to Scala 2.13.0 (OpenJDK 64-Bit Server VM, Java 11.0.3).
Type in expressions for evaluation. Or try :help.

scala> 42
res0: Int = 42

scala> {val b = new Array[Byte](256 * 1024* 1024); () => b}
res1: () => Array[Byte] = $$Lambda$857/0x00000001005d3440@5d5a51b1

scala> {val b = new Array[Byte](256 * 1024* 1024); () => b}
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid30550.hprof ...
Heap dump file created [303269903 bytes in 0.460 secs]
java.lang.OutOfMemoryError: Java heap space
  ... 29 elided

scala> :reset
Resetting interpreter state.
Forgetting this session history:

42
{val b = new Array[Byte](256 * 1024* 1024); () => b}

Forgetting all expression results and named terms: $intp

scala> {val b = new Array[Byte](256 * 1024* 1024); () => b}
java.lang.OutOfMemoryError: Java heap space

runzhiwang commented 4 years ago

@SethTisue Hi, -Yrepl-class-based does not make any difference.

runzhiwang commented 4 years ago

@som-snytt If the variable b is assigned twice with new Array[Byte](256 * 1024 * 1024), why not GC the first object, new Array[Byte](256 * 1024 * 1024), to free memory?

som-snytt commented 4 years ago

@runzhiwang Indeed. But it could be captured by a subsequent definition def f and not be available for collection yet. Some people might expect that f to "pick up" the new definition. The Scripted sticks bound variables in a map, and any definition using the variable looks it up dynamically and will see the new value.

Currently, the old value is a member of a statically reachable object, so there is no ordinary mechanism to make it collectable.
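
A rough sketch of the map-based idea, to show the difference from a static field. This is not the actual `Scripted` code; `Bindings`, `bind`, `lookup`, and `Demo` are hypothetical names, and the real mechanism may differ.

```scala
import scala.collection.mutable

// Hypothetical binding table: names are resolved at use time, not at definition time.
object Bindings {
  private val values = mutable.Map.empty[String, Any]

  // Rebinding a name overwrites the old entry, so the old value becomes
  // unreachable (and collectable) unless something else still holds it.
  def bind(name: String, value: Any): Unit = values.update(name, value)

  def lookup[A](name: String): A = values(name).asInstanceOf[A]
}

object Demo {
  // f does not capture the value of x; it looks x up on every call.
  def f(): Int = Bindings.lookup[Int]("x") + 1
}
```

With this shape, rebinding `x` both frees the old value and makes `f` "pick up" the new one, at the cost of a dynamic lookup and an unchecked cast on each access.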

The quicker fix suggested by the OP would be for the REPL to wrap values in SoftReferences and to wrap accesses in a dereferencing get. If a value has been collected, the REPL could recompute it.
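
A minimal sketch of such a holder, under the assumption that the REPL keeps the original expression around as a thunk. `SoftBinding` and its members are hypothetical names, not an existing REPL API.

```scala
import java.lang.ref.SoftReference

// Hypothetical holder: keeps the value softly reachable and recomputes it on demand.
final class SoftBinding[A <: AnyRef](compute: () => A) {
  private var ref = new SoftReference[A](compute())

  // Dereferencing get: return the cached value, or recompute if it was collected.
  def get: A = {
    val cached = ref.get
    if (cached != null) cached
    else {
      val fresh = compute()
      ref = new SoftReference[A](fresh)
      fresh
    }
  }

  // Drops the cached value explicitly; included here to make the recompute path easy to exercise.
  def clear(): Unit = ref.clear()
}
```

Recomputation is only safe for expressions without side effects, and an expensive computation may be repeated whenever memory gets tight, so this is a trade-off rather than a free fix.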