phetsims / energy-skate-park-basics

"Energy Skate Park: Basics" is an educational simulation in HTML5, by PhET Interactive Simulations.
http://phet.colorado.edu/en/simulation/energy-skate-park-basics
GNU General Public License v3.0

Sim crashing on iPad 2 #435

Closed jessegreenberg closed 4 years ago

jessegreenberg commented 6 years ago

From #431 and https://github.com/phetsims/QA/issues/158, the sim is crashing frequently on iPad2. @KatieWoe found that the one-off version (https://bayes.colorado.edu/dev/html/energy-skate-park-basics/1.4.0-trackCanvasInChrome.1/phet/energy-skate-park-basics_en_phet.html) was crashing, which could have been expected after rendering the tracks with canvas in Chrome for #431.

But then @KatieWoe noticed that the sim was crashing frequently in Safari as well, which is not expected because the change was Chrome-specific. We should see if another recent change to this sim could have caused this, or possibly other changes in scenery or common code.

jessegreenberg commented 6 years ago

Also, @lmulhall-phet and @KatieWoe confirmed that this is happening in 1.4.0-dev.1, so this would have been introduced prior to the date of that build.

samreid commented 6 years ago

Mac Chrome reports the following MB usage after starting the sim and putting the skater on the track on the 1st screen:

Published 1.1.7: 30MB Local built: 34.3MB Local requirejs: 38.4MB

ariel-phet commented 6 years ago

Unassigning @samreid as he has plenty of issues. @jessegreenberg will figure this out :cat:

jessegreenberg commented 6 years ago

Playing on an iPad2, I noticed that the sim runs very well in the Intro and Friction screens, but crashes quickly after starting to use the "Playground" screen.

EDIT: Maybe disregard this? Just hit a crash that suggests otherwise, maybe I haven't done enough testing.

jessegreenberg commented 6 years ago

Here are some notes from what I tried over the last couple days:

I tried to use a tool called Weinre to debug without tethering the iPad to a Mac. The tool worked great and I will use it a lot in the future. But I wasn't able to use it to get any information about this particular crash.

A while ago, I made a tool that prints sim sizes for all commits to the project. (https://github.com/phetsims/perennial/blob/master/bin/print-sim-sizes.sh) I modified this script to check out the project at the dates of commits in ESP:B, then build the sim and save the build, to hopefully find where this problem was introduced. If the sim crashes, it crashes within about 15 seconds of navigating between the screens; otherwise it doesn't crash at all. So it is easy to compare versions in this way. Doing this, I found that at 2017-08-18 10:30:09 there is no crashing, and at 2017-08-25 21:30:18 the sim crashes frequently. So something changed in that window, either in this sim or in the project, to cause this.
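The date-based search above is essentially a bisection over dated builds. Here is a minimal sketch, assuming the builds are sorted oldest to newest, the oldest is known good, and the newest is known bad. The `crashes` callback is hypothetical and stands in for the manual test on the iPad; it is not part of print-sim-sizes.sh or perennial.

```javascript
// Hypothetical helper: narrow down the first build that crashes.
// `builds` is sorted oldest -> newest; builds[ 0 ] is assumed good and
// builds[ builds.length - 1 ] is assumed bad.
function findFirstCrashingBuild( builds, crashes ) {
  let good = 0;
  let bad = builds.length - 1;
  while ( bad - good > 1 ) {
    const mid = Math.floor( ( good + bad ) / 2 );
    if ( crashes( builds[ mid ] ) ) {
      bad = mid;
    }
    else {
      good = mid;
    }
  }
  return { lastGood: builds[ good ], firstBad: builds[ bad ] };
}

// Placeholder build labels, not real builds of the sim.
const builds = [ 'b0', 'b1', 'b2', 'b3', 'b4', 'b5', 'b6', 'b7' ];
const result = findFirstCrashingBuild( builds, build => builds.indexOf( build ) >= 5 );
console.log( result );
```

Because a crashing build fails quickly and a good build does not fail at all, each manual test is cheap, which is what makes this kind of search practical.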

jessegreenberg commented 6 years ago

Here are more notes from this:

A sim built with SHAs from 2017-08-17 at 22:41:19 does not crash. A sim built with SHAs at 2017-08-18 10:25:37 crashes almost as soon as the third screen button is pressed (sometimes before).

The commit at 2017-08-17 at 22:41:19 is dba01386dd884865c2111c0b5c63f31a4cdfbefb to Sherpa, with commit message

Adding jshashes-1.0.7 for https://github.com/phetsims/aqua/issues/22 (string hashing needs)

The commit at 2017-08-18 10:25:37 is f611b69e80756e4e10653ec59113d0f7f74eea83 to chipper, with message

Removed stale comments, see https://github.com/phetsims/joist/issues/436

The chipper commit indeed only removed HTML comments, and would not have caused the break. I am going to check and see if my build at 2017-08-17 at 22:41:19 includes dba01386dd884865c2111c0b5c63f31a4cdfbefb. If not, that may indicate that jshashes-1.0.7 has something to do with this? I have no idea how, seems unlikely. I will also see if I missed any commits in this window.

jessegreenberg commented 6 years ago

I also noticed that this issue is closely related to #343, which also came down to memory-related issues. The fix there didn't seem to reduce the JS heap.

I noted that its JS Heap is unchanged at 35.5MB...

jessegreenberg commented 6 years ago

Most times before the sim crashes I notice that the skater image disappears and the sim stalls for a second or two before we see the "Something went wrong..." message.

jessegreenberg commented 6 years ago

My build at 2017-08-17 at 22:41:19 did not include the right sherpa SHA, it has SHA 7be57a56484a08b829ad60a853431667cb163973, just before the one I was after.

jessegreenberg commented 6 years ago

Oops, looks like the git rev-list --before option can only get as specific as days, so I can't look into builds at exact timestamps. But no matter, this allowed me to find the day that some breaking change may have been introduced, and from the above comments it looks like 2017-08-18 is our day. I looked through commits on that day and noticed this one:

commit d84702b1549b497366034cbf270f28147feef3fb
Author: Jonathan Olson <olsonsjc@gmail.com>
Date:   Fri Aug 18 13:56:15 2017 -0600

    Double WebGL backing scale if there is no built-in antialiasing, see https://github.com/phetsims/circuit-construction-kit-dc/issues/139

The change doubled the "backing scale" for WebGL on some devices (including iPad2), and there are comments in https://github.com/phetsims/circuit-construction-kit-dc/issues/139 like

This makes the rectangles look beautiful, but it doubles the WebGL memory usage. Will this crash the sim earlier, or exacerbate #141? Possibly, or perhaps the graphics memory is separate from the JS heap memory and hence OK to go a bit higher.

We may wish to investigate the 2x backing scale and try to solve the memory issues.

jessegreenberg commented 6 years ago

If I remove this.backingScale *= 2; from WebGLBlock it definitely helps. With that line, the sim crashes 10/10 times when switching between screens. When I removed that line, the sim crashed 1/10 times on my iPad2.
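For a rough sense of scale, here is a back-of-the-envelope estimate. It assumes (this is not confirmed anywhere in this thread) that the backing scale applies to both canvas dimensions and that one color buffer stores 4 bytes per pixel. Real WebGL allocations vary by browser and can include additional buffers, so treat the numbers as illustrative only.

```javascript
// Rough size of a single RGBA color buffer, in bytes.
// Assumption: the backing scale multiplies both width and height.
function estimateBufferBytes( cssWidth, cssHeight, backingScale ) {
  const bytesPerPixel = 4; // assumed RGBA, 8 bits per channel
  return ( cssWidth * backingScale ) * ( cssHeight * backingScale ) * bytesPerPixel;
}

const MB = 1024 * 1024;

// 1024x768 is the iPad 2 screen resolution; the sim's canvas may be smaller.
const before = estimateBufferBytes( 1024, 768, 1 ) / MB;
const after = estimateBufferBytes( 1024, 768, 2 ) / MB;
console.log( before, after ); // 3 12
```

Under these assumptions each buffer grows with the square of the backing scale, and every screen with its own WebGL canvas would pay that cost separately.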

jessegreenberg commented 6 years ago

I just tested again to see if the crash was a fluke after removing this.backingScale *= 2; from WebGLBlock, and I had exactly the same result, sim crashed 1/10 times. So that doesn't totally fix it.

samreid commented 6 years ago

Good discovery @jessegreenberg, the extra memory used for the WebGL canvas is a sensible culprit.

jessegreenberg commented 6 years ago

So that doesn't totally fix it.

I just remembered I was running a non-mangled version with Weinre; maybe that is the cause of the remaining crashes.

Regarding the comment about heap sizes (https://github.com/phetsims/energy-skate-park-basics/issues/435#issuecomment-409989428), I am not sure if that growth is a cause of this. For instance, when I run other sims like circuit-construction-kit-dc, the heap size grows beyond these values for me in Chrome.

jessegreenberg commented 5 years ago

In the above commit with https://github.com/phetsims/scenery/issues/859, I disabled backingScale as an antialiasing method while using mobile Safari. This helps quite a lot, but the sim still crashes infrequently on iPad2. I have only seen it crash now when switching scenes.
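The shape of that change can be sketched roughly as follows. This is not scenery's actual code: the function name and option names are hypothetical, and the real platform check lives in common code that is not shown in this thread.

```javascript
// Hypothetical decision function; names are illustrative, not scenery's API.
function chooseBackingScale( options ) {
  let backingScale = options.baseScale;

  // Doubling stands in for antialiasing when the context has none built in,
  // but it is skipped on mobile Safari to limit memory use.
  if ( !options.hasBuiltInAntialiasing && !options.isMobileSafari ) {
    backingScale *= 2;
  }
  return backingScale;
}

console.log( chooseBackingScale( { baseScale: 1, hasBuiltInAntialiasing: false, isMobileSafari: true } ) ); // 1
console.log( chooseBackingScale( { baseScale: 1, hasBuiltInAntialiasing: false, isMobileSafari: false } ) ); // 2
```

The tradeoff is visual: platforms that skip the doubling lose that form of antialiasing in exchange for the smaller buffer.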

jessegreenberg commented 5 years ago

I was searching around for other changes related to WebGL, I found this issue https://github.com/phetsims/scenery/issues/637. There is a comment about it making images more expensive, but reverting the changes had no impact on the crashing rate.

jessegreenberg commented 5 years ago

When I remove all of the nodes in the "WebGL Layer" in EnergySkateParkBasicsScreenView, I am unable to get the sim to crash, so it still seems related to WebGL.

@jonathanolson do you have any thoughts about why the sim might crash occasionally while changing screens? I noticed in the document that the "webgl-container" div doesn't exist until one of the screens is launched, and that each screen has a different canvas element for WebGL that gets added/removed from the document when screens are changed. I'm sure there are very good reasons for this, but could it be related?

jessegreenberg commented 5 years ago

@jonathanolson and I met to investigate this today. We started by assuming that the crashing was due to hitting the memory limit, so we began to profile. But we got wildly differing memory results depending on things like browser, operating system, and whether or not we were in a private window. So we were not able to pinpoint the problem with the profiling tools available to us.

I will just list out everything we tried and the memory usage values we observed from the profiling results. All comparisons were done against PhET branded versions of the sim.

MacOS Chrome task manager reported that the sim is using 160-180 MB while fuzzing. In a similar test of the deployed version, the task manager reported 138MB of use, then 104MB of use (presumably after a garbage collection). This indicated a substantial increase of ~60MB!

So we continued to investigate heap size, and observed 27.5MB for the published version, and 45.9MB for master just after start up. After fuzzing, these never got above 100MB, so the Chrome process itself was taking up a lot of memory. But we also noticed in the memory tools that Chrome was reporting things from FractionsCommon and EqualityLabScreenView, meaning it was counting objects from other tabs, so this test loses its value.

We took a look at phet.joist.display.getDebugHTML(), and used that information to verify that the number of Nodes and Blocks looked OK. The deployed version has 897 in this report, while the new version has 919.

We tried using the Safari memory profiling tools, and Safari reported that the sim was using about 600 MB for "Page" content (https://webkit.org/blog/6425/memory-debugging-with-web-inspector/). Crazy huge amount! Then in re-tests, we observed values much lower, but still around 300MB. In similar tests we found that the deployed version also has ~300MB of "Page" content, so we think that the 600MB report was a red herring.

Finally we tried incognito mode and observed that the JS heap as reported by Chrome was ~20MB lower for both published and master versions. Looking through a comparison of heap snapshots between two versions didn't provide much information.
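For logging the JS heap from inside a running page instead of reading it from DevTools, a small helper like the one below is one option. It relies on `performance.memory`, which is a non-standard, Chrome-only API, so it will not help on Safari or the iPad. The function name is hypothetical, and the values it returns are subject to the same variation described above.

```javascript
// Read the used JS heap size in MB from a performance-like object.
// `performance.memory` is non-standard and Chrome-only, so guard for its absence.
function usedHeapMB( perf ) {
  if ( !perf || !perf.memory ) {
    return null;
  }
  return perf.memory.usedJSHeapSize / ( 1024 * 1024 );
}

// In Chrome this would be called as usedHeapMB( window.performance ).
// A stub object is used here so the sketch runs anywhere.
const stub = { memory: { usedJSHeapSize: 32 * 1024 * 1024 } };
console.log( usedHeapMB( stub ) ); // 32
console.log( usedHeapMB( {} ) ); // null
```

This only covers the JS heap. It says nothing about graphics memory, which is the part under suspicion in this issue.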

jessegreenberg commented 5 years ago

Also, @jonathanolson mentioned that @phet-steele was able to produce a crash report by tethering the iPad to a Mac and inspecting with Xcode. @phet-steele would you be able to do this? Or if you don't have the platforms available could you please list the steps for how to do this? This could help verify whether the crashing is actually memory related.

I also may be able to see Safari crash reports after syncing my iPad with iTunes. https://help.getpocket.com/article/1098-how-to-find-the-iphone-ipad-app-crash-logs

jessegreenberg commented 5 years ago

I was able to get the crash log after connecting my iPad2 to iTunes. At the time of the crash, there is a new JetsamEvent .ips file. It looks something like this:

{"os_version":"iPhone OS 9.3.5 (13G36)","bug_type":"298","timestamp":"2018-09-17 18:53:47.47 -0400"}
{
  "kernel" : "Darwin Kernel Version 15.6.0: Fri Aug 19 10:37:54 PDT 2016; root:xnu-3248.61.1~1\/RELEASE_ARM_S5L8942X",
  "date" : "2018-09-17 18:53:47.47 -0400",
  "crashReporterKey" : "a5ac2c1416e6bcdc2e9c5f8c26d9f1e37389c3c8",
  "product" : "iPad2,4",
  "build" : "iPhone OS 9.3.5 (13G36)",
  "incident" : "CF4FA097-C462-4769-BB0E-DD674F8A20AC",
  "memoryStatus" : {
  "pageSize" : 4096,
  "memoryPages" : {
    "fileBacked" : 11730,
    "anonymous" : 561,
    "inactive" : 3959,
    "active" : 8002,
    "wired" : 38438,
    "speculative" : 330,
    "throttled" : 75287,
    "purgeable" : 0,
    "free" : 1394
  },
  "compressions" : 28984,
  "decompressions" : 16690,
  "compressorSize" : 1502,
  "uncompressed" : 5062
},
  "largestProcess" : "com.apple.WebKit",
  "timeDelta" : 1696,
  "processes" : [
  {
    "pid" : 472,
    "reason" : "vm-pageshortage",
    "name" : "callservicesd",
    "fds" : 50,
    "lifetimeMax" : 1439,
    "rpages" : 444,
    "cpuTime" : 0.542582,
    "states" : [
      "daemon",
      "idle"
    ],
    "purgeable" : 0,
    "uuid" : "249eb6ce-20aa-30db-ab50-51a46d3a08b4"
  },
...

All processes have "reason" : "vm-pageshortage", which according to https://developer.apple.com/library/archive/technotes/tn2151/_index.html means that the process was killed due to "memory pressure".
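The page counts in "memoryStatus" are easier to read once converted to sizes. A minimal sketch, assuming each count is a number of pages of "pageSize" bytes (the helper name is hypothetical, and only a subset of the fields from the report above is copied in):

```javascript
// Convert the page counts in a JetsamEvent "memoryStatus" section to MiB.
// Assumption: each entry in memoryPages is a count of pages of `pageSize` bytes.
function summarizeMemoryPages( memoryStatus ) {
  const bytesPerMiB = 1024 * 1024;
  const summary = {};
  Object.keys( memoryStatus.memoryPages ).forEach( key => {
    summary[ key ] = memoryStatus.memoryPages[ key ] * memoryStatus.pageSize / bytesPerMiB;
  } );
  return summary;
}

// Values copied from the report above (subset of fields).
const memoryStatus = {
  pageSize: 4096,
  memoryPages: { active: 8002, wired: 38438, free: 1394 }
};
const summary = summarizeMemoryPages( memoryStatus );
console.log( summary.free.toFixed( 1 ), summary.wired.toFixed( 1 ) ); // 5.4 150.1
```

Read this way, only about 5 MiB was free when the event was recorded, which fits the "vm-pageshortage" reason.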

samreid commented 5 years ago

I’m running a built version from today's master of Energy Skate Park: Basics on iPad2 on iOS 9.3.5. Over a minute with no crash.

UPDATE: running ?fuzzMouse on the same built version on the same iPad2, crashes at 98 seconds. UPDATE: 2nd run with ?fuzzMouse crashes at 70 seconds.

UPDATE: I built a new version with 5 screens: intro | friction | playground | playground | playground and fuzzing crashed at 4 seconds, 38 seconds, 24 seconds, 31 seconds.

UPDATE: built 5 screens fuzzing with no line dash (testing SVG memory concern). Crash at 40 seconds, 26 seconds.

UPDATE: Reduced the number of points in the track view, still crashed at 40 seconds.

UPDATE: With a max number of control points as 1000 and fuzzing disabled, the sim launches the homescreen, then crashes when trying to show the playground screen. 2nd run: it is showing the playground screen OK but very sluggish.

UPDATE: Changing root renderer to canvas sometimes crashes within first few seconds of launching. Current session going 30+ seconds though.

UPDATE: Built version with 3 screens (no changes from master), but running with ?fuzzMouse&webgl=false crashes at 3:48.

jessegreenberg commented 5 years ago

I can confirm the results of https://github.com/phetsims/energy-skate-park-basics/issues/435#issuecomment-422424628 on my iPad2 as well.

jessegreenberg commented 5 years ago

I found that the sim crashed most consistently when changing screens. I built a version with 3 Playground screens, but added pickable: false to NavigationBar.js. The sim did crash with fuzzMouse, but it took 4 minutes.

UPDATE: 3 Playground screens with pickable: false removed from navigation bar crashed in 90 seconds.

samreid commented 5 years ago

Does the sim crash in the PhET iOS App the same way it crashes when running in Safari?

samreid commented 5 years ago

One strategy that may yield a useful breakdown of memory usage could be to embed the sim in a WKWebView and launch with Xcode and Activity Monitor, like described here: https://stackoverflow.com/questions/36561063/get-current-memory-usage-of-wkwebview

Maybe that will open up the sim so we can see the memory usage? Not 100% sure whether this or something like this would be useful, but I thought I'd mention it just in case.

jessegreenberg commented 5 years ago

Thanks @samreid! That could definitely be a way to get more information.

jessegreenberg commented 5 years ago

I ran fuzzTest as a comparison with the deployed version of the sim and found that the sim is crashing on that version with fuzz testing as well. Three trials, it crashed at 105 seconds, 144 seconds, 120 seconds.
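To put these trials next to the earlier ?fuzzMouse runs on master (98 and 70 seconds), a quick average helps. The samples are tiny and come from different testers and devices, so this is a rough comparison at best.

```javascript
// Average time-to-crash in seconds for a list of fuzz trials.
function mean( values ) {
  return values.reduce( ( sum, value ) => sum + value, 0 ) / values.length;
}

const deployedTrials = [ 105, 144, 120 ]; // deployed version, from this comment
const masterTrials = [ 98, 70 ]; // master with ?fuzzMouse, from the earlier comment

console.log( mean( deployedTrials ) ); // 123
console.log( mean( masterTrials ) ); // 84
```

Both versions crash under fuzzing within a couple of minutes, so fuzzing alone does not separate them cleanly.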

jessegreenberg commented 5 years ago

Discussed with @ariel-phet, we may not do any more investigation here as the sim is crashing very infrequently on iPad2 under normal usage. We will see if QA team can verify the low rate of crashing now, and we will move forward depending on what we learn in the next dev test.

jessegreenberg commented 4 years ago

From review comment asking about this in energy-skate-park, this flag can now be removed since we are not using WebGL and we are no longer targeting iPad2. I tested the sim for several minutes on an iPad 3 and saw no crashing or memory related failures.