phetsims / unit-rates

"Unit Rates" is an educational simulation in HTML5, by PhET Interactive Simulations.
GNU General Public License v3.0
0 stars 2 forks source link

CT: above our memoryLimit #207

Closed KatieWoe closed 5 years ago

KatieWoe commented 5 years ago
unit-rates : fuzz : built : run
Uncaught Error: Average memory used (1016MB) is above our memoryLimit (1000MB). Current memory: 1709MB.
Error: Average memory used (1016MB) is above our memoryLimit (1000MB). Current memory: 1709MB.
    at t.value (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1831192)
    at e.listener (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1968678)
    at e.value (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:151135)
    at e.stepSimulation (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1979642)
    at e.stepOneFrame (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1979448)
    at e.runAnimationLoop (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1979308)
id: Bayes Chrome
Approximately 10/28/2018, 4:14:30 PM

unit-rates : fuzz : built : run
Uncaught Error: Average memory used (1026MB) is above our memoryLimit (1000MB). Current memory: 1783MB.
Error: Average memory used (1026MB) is above our memoryLimit (1000MB). Current memory: 1783MB.
    at t.value (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1831192)
    at e.listener (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1968678)
    at e.value (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:151135)
    at e.stepSimulation (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1979642)
    at e.stepOneFrame (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1979448)
    at e.runAnimationLoop (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1979308)
id: Bayes Chrome
Approximately 10/28/2018, 4:14:30 PM

unit-rates : fuzz : built : run
Uncaught Error: Average memory used (1001MB) is above our memoryLimit (1000MB). Current memory: 1703MB.
Error: Average memory used (1001MB) is above our memoryLimit (1000MB). Current memory: 1703MB.
    at t.value (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1831192)
    at e.listener (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1968678)
    at e.value (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:151135)
    at e.stepSimulation (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1979642)
    at e.stepOneFrame (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1979448)
    at e.runAnimationLoop (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/unit-rates/build/phet/unit-rates_en_phet.html?postMessageOnLoad&postMessageOnError&postMessageOnBeforeUnload&fuzz&memoryLimit=1000:1267:1979308)
id: Bayes Chrome
Approximately 10/28/2018, 4:14:30 PM

unit-rates : fuzz : require.js : run
Uncaught Error: Average memory used (1001MB) is above our memoryLimit (1000MB). Current memory: 1693MB.
Error: Average memory used (1001MB) is above our memoryLimit (1000MB). Current memory: 1693MB.
    at MemoryMonitor.measure (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/MemoryMonitor.js?bust=1540772386577:63:15)
    at Emitter.listener (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540772386577:232:30)
    at Emitter.emit (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/axon/js/Emitter.js?bust=1540772386577:187:53)
    at Sim.stepSimulation (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540772386577:915:34)
    at Sim.stepOneFrame (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540772386577:896:14)
    at Sim.runAnimationLoop (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540772386577:879:14)
id: Bayes Chrome
Approximately 10/28/2018, 4:14:30 PM

unit-rates : fuzz : require.js-canvas : run
Uncaught Error: Average memory used (1001MB) is above our memoryLimit (1000MB). Current memory: 1810MB.
Error: Average memory used (1001MB) is above our memoryLimit (1000MB). Current memory: 1810MB.
    at MemoryMonitor.measure (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/MemoryMonitor.js?bust=1540799275115:63:15)
    at Emitter.listener (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540799275115:232:30)
    at Emitter.emit (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/axon/js/Emitter.js?bust=1540799275115:187:53)
    at Sim.stepSimulation (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540799275115:915:34)
    at Sim.stepOneFrame (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540799275115:896:14)
    at Sim.runAnimationLoop (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540799275115:879:14)
id: Bayes Chrome
Approximately 10/28/2018, 4:14:30 PM

unit-rates : fuzz : require.js-canvas : run
Uncaught Error: Average memory used (1078MB) is above our memoryLimit (1000MB). Current memory: 1850MB.
Error: Average memory used (1078MB) is above our memoryLimit (1000MB). Current memory: 1850MB.
    at MemoryMonitor.measure (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/MemoryMonitor.js?bust=1540814119581:63:15)
    at Emitter.listener (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540814119581:232:30)
    at Emitter.emit (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/axon/js/Emitter.js?bust=1540814119581:187:53)
    at Sim.stepSimulation (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540814119581:915:34)
    at Sim.stepOneFrame (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540814119581:896:14)
    at Sim.runAnimationLoop (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540814119581:879:14)
id: Bayes Chrome
Approximately 10/28/2018, 4:14:30 PM

unit-rates : xss-fuzz : run
Uncaught Error: Average memory used (1001MB) is above our memoryLimit (1000MB). Current memory: 1842MB.
Error: Average memory used (1001MB) is above our memoryLimit (1000MB). Current memory: 1842MB.
    at MemoryMonitor.measure (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/MemoryMonitor.js?bust=1540792615422:63:15)
    at Emitter.listener (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540792615422:232:30)
    at Emitter.emit (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/axon/js/Emitter.js?bust=1540792615422:187:53)
    at Sim.stepSimulation (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540792615422:915:34)
    at Sim.stepOneFrame (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540792615422:896:14)
    at Sim.runAnimationLoop (https://bayes.colorado.edu/continuous-testing/snapshot-1540764870866/joist/js/Sim.js?bust=1540792615422:879:14)
id: Bayes Chrome
Approximately 10/28/2018, 4:14:30 PM
pixelzoom commented 5 years ago

Prior to publishing, this sim had extensive memory leak testing on 3/6/17, see https://github.com/phetsims/unit-rates/issues/170, and the heap size never grew past 55MB.

I looked through all commits since 3/6/17. The only significant change in master since then was to convert from TWEEN to Animation. So I guess I'll start there.

pixelzoom commented 5 years ago

Commits related to conversion from TWIXT to Animation, in chronological order:

8/24/18 https://github.com/phetsims/unit-rates/commit/eb9598a927712c6d276e83c754470cad5649570d 8/24/18 https://github.com/phetsims/unit-rates/commit/a5deac3684e42b84596ba95524af8066d8c4cc58 10/27/18 https://github.com/phetsims/unit-rates/commit/a294a7fb25f39715ab9fcc9b2c71ad4174f838b9

pixelzoom commented 5 years ago

According to @jonathanolson, CT fuzz test for requirejs is 120 seconds with query parameters: ?brand=phet&ea&fuzz&memoryLimit=1000.

So if I run for > 2 minutes with ?brand=phet&ea&fuzz, then I should see a heap snapshot > 1000MB.

pixelzoom commented 5 years ago

With Chrome 70.0.3538.77 on macOS 10.11.6, running in requirejs mode with ?brand=phet&ea&fuzz for 3 minutes, the heap size was a whopping 1411 MB. It took over 30 minutes to build the snapshot. And meanwhile, the sim "paused before potential out-of-memory crash".

pixelzoom commented 5 years ago

I don't see anything in the 1411 MB snapshot that looks related to Animation. But I do see a huge number of Emitter instances, apparently changeEmitter in Color.

I took another snapshot at startup, 76MB. (That's bigger than the 55MB size after 25 minutes of runtime in the original memory tests, see #170). A second snapshot 10 seconds later was 123MB. Comparing snapshots, I again see a lot of Emitter instances.

@jonathanolson is it possible that https://github.com/phetsims/scenery/issues/879 is not resolved?

pixelzoom commented 5 years ago

This issue most definitely blocks publication of this (and possibly other) sims.

jonathanolson commented 5 years ago

Hmm, this looks related to https://github.com/phetsims/sun/issues/362.

At least one case in unit-rates is:

I'd like to tag for the developer meeting, because any of the ways of solving this generally have consequences:

Long-term, doing both things sounds correct, but I'm curious what others think.

jonathanolson commented 5 years ago

I've had two CT browser tabs crash as part of https://github.com/phetsims/aqua/issues/54 while testing this. I'm bumping priority to high so we ensure it is talked about in dev meeting.

jonathanolson commented 5 years ago

I'll create an issue for immutable colors.

pixelzoom commented 5 years ago

11/1/18 dev meeting notes:

pixelzoom commented 5 years ago

This isn't isolated to sim-specific code. I'll need to add dispose to SCENERY_PHET/NumberKeypad too.

pixelzoom commented 5 years ago

In the above commit, I added dispose calls for all buttons, and a call to disposeSubtree for SCENERY/NumberKeypad. Tested with Chrome 70.0.3538.77 on macOS 10.11.6, running in requirejs mode with ?brand=phet&ea&fuzz.

The good news is that I can now take a heap snapshot without waiting 20 minutes, and CT shouldn't be complaining. Startup heap size was 86MB. After 1 minute, heaps size was 126MB. After 5 minutes, heap size was 98MB, which seems to indicate some GC. After 10 minutes, heap size was back up to 133MB.

The bad news is that these heap sizes are still much larger than expected. In prior memory leak testing (#170), startup heap size was 45MB, and the sim stabilized at ~52MB after 25 minutes. We are nowhere near those numbers.

Also worth noting is that the sim becomes sluggish/jerky almost immediately during fuzz testing, behavior that's indicative of a leak.

@jonathanolson any ideas?

pixelzoom commented 5 years ago

Note to self: Each line of code that I added as a workaround for this issue has a TODO next to it.

pixelzoom commented 5 years ago

Adding dev meeting label to discuss timeframe for addressing this. Since we have a number of sims being published with new release branches (graphing-quadratics, graphing-lines, resistance-in-a-wire,...) it would be good to get to the bottom of this.

jonathanolson commented 5 years ago

I made the change for color listeners, so I'll retest for memory leaks right now.

jonathanolson commented 5 years ago

2 minutes of fuzzing and the sim is still <100MB, so the "major" memory leak was resolved by changes made.

pixelzoom commented 5 years ago

@jonathanolson said:

2 minutes of fuzzing and the sim is still <100MB ...

@jonathanolson At what heap size value did it stabilize? As noted above, this sim previously stabilized at ~55MB and never grew larger after 25 minutes of testing. So < 100 after only 2 minutes is not what I'd consider conclusive.

jonathanolson commented 5 years ago

I need to close the testing for now, but here's what I found:

3:07pm: ~85MB (shortly after launching) 3:18pm: ~144MB 3:40pm: ~112MB 4:21pm: ~110MB 5:36pm: ~135MB

The amount of a11y/phet-io and internal properties (for things like color tracking or other things) has definitely taken a hit on the memory usage, so I would expect it unfortunately to be higher than we had previously.

pixelzoom commented 5 years ago

The amount of a11y/phet-io and internal properties (for things like color tracking or other things) has definitely taken a hit on the memory usage, so I would expect it unfortunately to be higher than we had previously.

The increase in this case is > 100%. If a11y/phet-io and internal properties are responsible for that increase (which I doubt), then we have a serous problem with how those things have been implemented.

jonathanolson commented 5 years ago

If a11y/phet-io and internal properties are responsible for that increase (which I doubt)

I'll build compare the 1.0 and master branches to see if I can identify (based on heap snapshots) the major differences.

pixelzoom commented 5 years ago

@jonathanolson Can I remove the disposal of buttons that I added in https://github.com/phetsims/unit-rates/commit/5a8bf05533e25f13c1b6082c97bb853fede1a699?

jonathanolson commented 5 years ago

I think that disposal might be good long-term.

Chart of initial memory usage by date (in master from approximate maintenance release to now)

screen shot 2018-11-08 at 1 37 32 pm

I'm zeroing in on the exact series of commits that triggered the issues.

samreid commented 5 years ago

The preceding commit reduces requirejs phet-brand Unit Rates memory from 77.2MB to 55.4MB.

samreid commented 5 years ago

The preceding commit reduces requirejs phet-brand Unit Rates memory from 52.4MB to 43.6MB.

samreid commented 5 years ago

After above commits, we are down to 43.0MB.

samreid commented 5 years ago

Down to 42.6 MB

pixelzoom commented 5 years ago

@samreid looks like the source of the memory increase was identified. Can you summarize here in a comment, for posterity?

samreid commented 5 years ago

The changes I made were predominantly about factoring out instantiated parametric types. For instance, https://github.com/phetsims/axon/commit/4a9762e882ea9862af96fdf0e736754e6e6c35d8 factors out EmitterIO( [] ) instead of creating an equivalent but different one for each Emitter. I cannot say for certain that this was "the source of the memory increase", but it seemed like a safe and straightforward way to significantly reduce the memory footprint. We can probably reduce size further by eliminating closures in common code, or by other strategies.

zepumph commented 5 years ago

For clarity, @jonathanolson and I (and mainly @jonathanolson) worked out that the source of the memory leak came from the commit when I moved phetioInherit from the phet-io repo into tandem. This consequently made it so that phetioInherit was not just "stubbing" out TypeIOs and returning no-op functions like function(){}, but instead it was actually creating TypeIOs. This exposed the memory leak that @samreid explained above to all sims, even in phet brand.

Re his comment: I think that there is likely other places where we can decrease memory consumption, seeing as 150% is still a large size increase to all sims, and likely that is from things like default parameters that @samreid was able to factor out some.

zepumph commented 5 years ago

In the above commits, I found that unit rates phet brand went down another .3 MB for DragListener.draggedEmitter and not much for Checkbox.toggledEmitter but that likely effected other sims (with a lot of checkboxes maybe?).

I then looked at all usages of EmitterIO( and didn't see any others that looked like they were being used systemically. I'm now a bit more unsure about where the other MB have come from.

pixelzoom commented 5 years ago

Somehow we went from @jonathanolson saying:

I'm zeroing in on the exact series of commits that triggered the issues.

To this email from @zepumph:

To be sure everyone is on the same page:

We know why and how this happened. Since moving phetioInherit out of phet-io ( support https://github.com/phetsims/axon/issues/190), it is no longer hidden behind the ifphetio! plugin. Therefore each Type, even simple value types like Vector2, gained a fully separate TypeIO object and reference to it in any brand. So now each Vector2 has a Vector2IO that extends ObjectiO, whereas before each Vector2 instance would have had a reference to function(){} (the stub returned from the ifphetio! plugin).

Now with the question of where to go from here. . . A few thoughts:

  • Eventually unit-rates will use this much memory because it will be instrumented for phet-io, and run in phet-io.
  • In today's dev meeting, we just talked about aligning a piece of phet-io with phet pretty heavily, in that all value validation will be done through phetioType. Though that isn't set in stone, to me it sets some precedent of tolerance towards adding this memory to phet brand.
  • This is one step closer to phet-io and phet branded sims being the same code. Perhaps one way to think about this is to strive for a "single sim" rather than two different ones depending on brand.
  • If we decide that the increased memory in phet brand is unacceptable, then we may need to rethink multiple decisions about phet-io, and integrating pieces into the main code base rather than keeping things separate.

I asked:

@samreid looks like the source of the memory increase was identified. Can you summarize here in a comment, for posterity?

Then @samreid described specific commits, which was not what I'm looking for

So to clarify.... What was the general problem, and what is the general solution that you're pursuing?

Also highly recommended to create a general issue somewhere to be dealing with this, not deal with it in this sim-specific issue.

pixelzoom commented 5 years ago

And yes, I've read https://github.com/phetsims/unit-rates/issues/207#issuecomment-437248201 and https://github.com/phetsims/unit-rates/issues/207#issuecomment-437252582, but I don't understand.

samreid commented 5 years ago

And yes, I've read #207 (comment) and #207 (comment), but I don't understand.

Perhaps a short call would be very helpful? I'll be on slack.

pixelzoom commented 5 years ago

@samreid and I discussed on Slack.

The cause of this issue is now moved to https://github.com/phetsims/tandem/issues/71. Please comment and commit there.

pixelzoom commented 5 years ago

I tested and got startup heap size of 45MB on Chrome + macOS, so that correlates with what @samreid is seeing. Closing this issue. Further memory improvements related to IO types will continue in phetsims/tandem#71.