phetsims / tasks

General tasks for PhET that will be tracked on Github
MIT License
5 stars 2 forks source link

investigate whether sims are leaking memory #546

Closed jbphet closed 8 years ago

jbphet commented 8 years ago

While preparing a release candidate for Gravity Force Labs, @aadish and I ran a test where we loaded the sim with the fuzzMouse=100 query param and periodically took heap snapshots using the Chrome developer tools. I expected to see the memory increase to some value and then stabilize, figuring that garbage collection would kick in and limit memory consumption. What we saw instead is that the memory kept increasing and, after several minutes, got to the point where trying to take a heap snapshot resulted in the dev tools disconnecting, requiring a reload of the sim to gather any more data. This was when the memory allocation was around 100MB, which seems like a lot.

The developers discussed this in the 3/17/2016 developer meeting, and we're not sure whether it means that sims are leaking memory, but we agreed that it should probably be investigated. I'm on the hook to start looking at this.

jbphet commented 8 years ago

During the developer meeting it was suggested that I profile using Hooke's Law 1.0, since @pixelzoom feels like that was tested for memory leaks before release. I profiled this sim by first loading it and taking a heap snapshot (all snapshots are being done using Chrome developer tools by the way), and then reloading it with fuzzMouse=100 set as a query parameter and taking snapshots every 30 seconds. The results showed that the memory does increase in this case. Here are the results:

Time (s) Size (MB)
0 25.1
30 39.6
60 52.3
90 61.2
120 72.0
150 81.1
180 91.9

Here's a screenshot of the developer tools at the end of this testing: hookes-law-heap-snapshots

For reference, the developer tools web site says that the number below the snapshot name is the “total size of the reachable JavaScript objects”, see https://developer.chrome.com/devtools/docs/heap-profiling.

From this, it would appear that memory allocation is indeed increasing.

jbphet commented 8 years ago

@samreid may have found a smoking gun. He ran these tests on the 'blast' sim, which doesn't have much going on, and found that memory was increasing there too. He commented out the phet-menu, and found that memory allocation no longer increased. Nice detective work!

jbphet commented 8 years ago

I just tested the current master version of Hooke's Law with a line of code added to PhetButton.js to make it unpickable so that the PhetMenu won't be created during fuzz testing. The memory allocation still goes up, but much more slowly than it did before, and it stays pretty close to 30MB total. Seems like PhetMenu may be responsible for the bulk of the issue. Here is a screenshot of the data from this test:

hookes-law-heap-snapshots-2

pixelzoom commented 8 years ago

AccessiblePeer looks suspect. It's constructor has @param {DOMElement} domElement, to which it adds listeners. It has no dispose function, so it never removes listeners. I don't know if that's significant or related to this, but AccessiblePeer is created by PhetMenu. And as a common-code component, AccessiblePeer should definitely have a dispose function.

Edit: AccessiblePeer does have a dispose function. But it doesn't call removeEventListener.

jbphet commented 8 years ago

It may be related to AccessiblePeer, but I think something more basic may be going on too. I just ran a test where I put a debugger statement in the code just before phetMenu.dispose() is called in PhetButton, and this line of code is never hit if I then click on the button and then click outside of the menu. If I do this several times, then take a heap snapshot and look at the count of PhetMenu objects, the number corresponds to the number of times I've clicked on the button. So it seems pretty clear that there is a path where PhetMenu is getting allocated and never disposed of.

I also tried the same test but clicking on items in the menu instead of outside of it. In this case, the dispose function is called, but the count of PhetMenu objects still continues to increase according to the profiler. Maybe this this is where AccessiblePeer comes into play.

Assigning to @samreid initially to have a look, since I believe he did a lot of this code originally. He can reassign as appropriate.

jbphet commented 8 years ago

As for the priority of this issue, it seems like this is unlikely to be a problem for our customers, since they'd have to open the menu a lot of times in order to run into any memory issue, but it does make it hard to test sims for memory leaks since this leak would likely overshadow any others.

ariel-phet commented 8 years ago

Marking as medium priority

Agreed with @jbphet that unlikely to be affecting users, but it does seem important to address to allows us to test for memory leaks.

ariel-phet commented 8 years ago

@samreid @jbphet should this issue also now be moved to joist?

jbphet commented 8 years ago

I was going to let @samreid make that determination. Seems like a joist issue to me, but @pixelzoom mentioned AcceissiblePeer, which is in Scenery, so I thought a little more investigation was in order before moving the issue.

jessegreenberg commented 8 years ago

AccessiblePeer should definitely have a completed dispose function, to be tracked in https://github.com/phetsims/scenery-phet/issues/226. But it is not a culprit for these memory leaks. AccessiblePeer is still hidden behind the accessibility query parameter and should never be touched in the tests mentioned in this issue.

samreid commented 8 years ago

I'll create a Joist issue for the AboutDialog memory, but let's leave this issue open until we test for other systemic issues.

jbphet commented 8 years ago

We discussed the importance of tracking down all the memory leaks during the 4/12/2016 developer meeting and decided that we should determine whether the sims actually exhibit problems in a memory constrained environment, such as an iPad 2. To test this, I fuzz tested (fuzzMouse=100) Acid Base Solutions v1.2.4 on an iPad 2 model A1395 with 16 GB of memory running iOS 9.3.1 (Witten). I ran two trials, and in both cases mobile Safari crashed (disappeared from the screen) at somewhere around the 25 minute mark. So I think the answer is probably yes - we should fix the leaks.

By the way, there is no special reason for choosing Acid Base Solutions. It's the first HTML5 sim in alphabetical order, and I profiled it using Chrome dev tools and found that it was leaking memory at around the same rate as the other sims. The version that I tested is one that is published as of this writing, and has none of the recent fixes to the framework that are meant to address the memory leaks.

samreid commented 8 years ago

We should test how long master of the same sim runs. Perhaps joist memory fixes have helped?

samreid commented 8 years ago

Assigning to @jbphet.

jbphet commented 8 years ago

I built a version of Acid Base Solutions off of master as of today and re-ran the fuzz test on it using the same iPad 2 referenced in one of the comments above. The results indicate that things are better, but the testing was ultimately a bit inconclusive because the home screen icons eventually scroll out of view due to https://github.com/phetsims/joist/issues/331, and as a result fuzz testing gets stuck on the home screen. This didn't happen until roughly 50 minutes of test time, so things are definitely better (since previously Safari would crash after about 25 minutes), but this test should be re-run once the issue with the moving home screen icons is fixed.

samreid commented 8 years ago

The results indicate that things are better, but the testing was ultimately a bit inconclusive because the home screen icons eventually scroll out of view due to phetsims/joist#331, and as a result fuzz testing gets stuck on the home screen

I put a bandaid on the joist 331 issue, so you should be able to proceed with testing here.

jbphet commented 8 years ago

Cool, thanks. I'll re-run the test and report the results today.

jbphet commented 8 years ago

I re-ran the test on a built version of Acid Base Solutions that incorporated both the fixes for the memory leaks in common code and the 'bandaid' fix for the drifting icons on the home screen. This test was still running after 20 hours, which is a huge improvement.

samreid commented 8 years ago

Given the rate of memory leak (as measured in Chrome Dev Tools) and the fact that it took 20 minutes to crash an iPad, you should be able to extrapolate from the new rate of memory leak and project the crash time for the fixed app:

time to crash the fixed app in minutes = memory leak rate of broken app / memory leak rate of fixed app * 20

For instance, if you estimate the broken app leak rate is 1MB/min and the fixed app leak rate is 0.1MB/min, then we could extrapolate that the fixed app would crash after 1/0.1*20 = 200 minutes.

samreid commented 8 years ago

I'm somewhat interested in the numbers above, so I'll take a look.

Running 1.2.4 with ?fuzzMouse=100 Initial Memory 34.5MB, after 30 seconds 59MB (average over two runs)

samreid commented 8 years ago

For a fresh build in master and fuzzMouse=100: Initial Memory 34.5MB, after 30 seconds 35.7MB

Hence the "fixed" app still has a slow leak that would crash the iPad after roughly 24.5/1.2*20=408 minutes, which is roughly 7 hours.

samreid commented 8 years ago

Oh wait, @jbphet reported that the fixed app was running over 20 hours, so there must be something else at play here.

jbphet commented 8 years ago

It's not safe to assume that if there is more memory allocated after 30 seconds there is a memory leak. My experience has been that the simulations continue to allocate memory for a while during fuzz testing, which makes sense, since there are new portions of the code being run that may allocate memory. The sim's memory allocation tends to stabilize after a few minutes once more of the code has been encountered during the testing. A better test would be to start it, let it run for two minutes, then take several memory snapshots at, say, 30 second intervals.

jbphet commented 8 years ago

Fuzz test on iPad 2 on Acid Base Solutions with mem leak fix and icon shift bandaid (details of test described above) still running after 24 hours with no discernible slowdown. I'm calling the test a success and shutting it down.

pixelzoom commented 8 years ago

still running after 24 hours with no discernible slowdown

I didn't realize that the performance also changed.

How was the memory profile after 24 hours? Leaking or not?

jbphet commented 8 years ago

I didn't realize that the performance also changed.

I entered the bit about "no discernible slowdown" because during the previous test - the one where mobile Safari crashed after ~25 minutes - the sim also looked a bit slower (@jonathanolson was in the room at the time and thought so too). I wanted to make it clear for the record that, as far as I could tell, the sim was behaving normally after 24 hours.

How was the memory profile after 24 hours? Leaking or not?

The test was run on an iPad, so I wasn't collecting memory profile data. It may be possible to obtain this by tethering to a Mac, but I didn't do so. The point of the test was to make sure that memory leakage isn't bad enough to crash the browser or cause other problems when running on the most memory limited device that we support.

jbphet commented 8 years ago

I re-ran the profiling test from https://github.com/phetsims/tasks/issues/546#issuecomment-198136011 above on the current master version of Hooke's Law. This was a built version. Below are the results, with the initial results included:

Time (s) Size (MB) Before Fixes Size (MB) After Fixes
0 25.1 24.4
30 39.6 25.2
60 52.3 25.6
90 61.2 25.7
120 72.0 25.9
150 81.1 26.1
180 91.9 26.1

If I assume that memory usage stabilizes at the 1 minute mark, the average leak rate before the changes was ( 91.9 - 52.3 ) / 120 = 330 kB/sec. The leak rate after the fixes was ( ( 26.1 - 25.6 ) / 120 = 4 kB/sec. There still appears to be a slow leak, but it is almost two orders of magnitude less than it was before the fixes.

I tried gathering data at the 10 minutes mark, but when this test was run on the pre-fixes version (v1.0.4), the dev tools crash when trying to take a heap snapshot. For the post-fixes version, the total memory size consumed at 10 minutes was 26.5 MB, which means it was leaking at around ( 26.5 - 26.1 ) / 420 = 952 bytes/sec from the 3 minute mark to the 10 minute mark.

jbphet commented 8 years ago

Marking for developer meeting to decide whether to close this, do a bit more investigation on a simple sim, or unassign.

samreid commented 8 years ago

My current recommendation (I reserve the right to change my mind after discussion):

As part of development of new sims or when old sims are brought up to standards, this code review will help us discover and address leaks in simulations and common code.

jbphet commented 8 years ago

Discussed in the 4/21/2016 developer meeting and we decided to add a screen to the "blast" sim and run this memory testing on that, then determine next steps based on the results. I will add the screen and will work with @phet-steele to gather the testing data.

jbphet commented 8 years ago

I've added the screen and published the new sim to spot, see http://www.colorado.edu/physics/phet/dev/html/blast/0.0.0-dev.5/blast_en.html. I've discussed the data gathering with @phet-steele and he's all over it. Assigning to him.

phet-steele commented 8 years ago

Win 10 Chrome:

Time (min) Size (MB)
0:00 16.5
0:30 17.2
1:00 17.5
1:30 17.9
2:00 18.2
2:30 17.9
3:00 18.1
3:30 18.3
4:00 18.0
4:30 18.1
5:00 18.4
5:30 18.5
6:00 18.2
6:30 18.6
7:00 18.5
7:30 18.5
8:00 18.6
8:30 18.6
9:00 18.3
9:30 18.6
10:00 18.3
phet-steele commented 8 years ago

I know everyone was yearning for a graphical representation: memorygraph

phet-steele commented 8 years ago

@jbphet did a snapshot at the time of this writing out of curiosity (18 minutes after the test) and it had climbed to 19.2 MB. So we can say ~28 minutes after startup the sim had not quite plateaued.

jbphet commented 8 years ago

I talked with @samreid about using re-running this test with some Chrome flags set that might help us to reduce the impact of memory allocations from the browser and thus distinguish them from memory allocations made by the sim code. We decided to try setting the v8-cache-options value to Disabled. These flags can be set in Chrome ussing chrome://flags in the address bar. Here are the results on the 2-screen Blast test:

Snapshot Time (min) Size (MB)
1 0:00 8.5
2 0:30 8.8
3 1:00 9.3
4 1:30 9.5
5 2:00 9.7
6 2:30 9.8
7 3:00 9.5
8 3:30 9.6
9 4:00 9.9
10 4:30 9.9
11 5:00 9.9
12 5:30 9.6
13 6:00 9.6
14 6:30 9.6
15 7:00 9.7
16 7:30 9.9
17 8:00 9.7
18 8:30 9.7
19 9:00 9.9
20 9:30 10.0
21 10:00 9.9

The total memory allocation is less than it was when the flag was set to Default, but the amount of variation between the snapshots is pretty similar.

jbphet commented 8 years ago

Here's a graph of the data in the previous comment: mem-usage-graph

jbphet commented 8 years ago

We discussed this issue in the 5/12/2016 developer meeting and decided that the current state of affairs is acceptable. The big memory leaks have been fixed, and we have several tests that demonstrate that there are no major leaks in most sims. While there is some "noise" in the measurement (probably due to the way the V8 engine works, but possibly just because of where the snapshots occur during the operation of the sim), this doesn't prevent us from determining whether significant memory leaks are occurring. Closing.