Closed pixelzoom closed 5 years ago
@ariel-phet said we can ignore iOS 9 and Win10 + IE.
@KatieWoe can you describe the behavior on Win10 + Chrome? Does it feel jerky? What specifically is the test device? Is it an older or newer model computer?
@arouinfar can you also evaluate on @KatieWoe's test device?
It seemed to be a bit jerkier than the other platforms, but it didn't strike me as being too bad. I probably wouldn't have registered it as a problem if I weren't specifically looking for it. It is my work laptop, so it's relatively new.
@pixelzoom @KatieWoe I tested dev.39 and dev.40 out on a brand new gaming PC running Win 10 with pretty beefy specs (AMD 3700X CPU, AMD Radeon 5700XT GPU, 16 GB 3200 MHz RAM) and I am seeing a similar pattern, as described by @KatieWoe. Chrome was noticeably jerkier than Edge, but I wouldn't call the performance unacceptable.
Test case
Edge never budges from 60 FPS. Chrome varied a bit, but hovered around 35-40 FPS. I'm seeing a huge spike in GPU utilization when running the sim in Chrome (up to 90%) compared to Edge (never more than 15%), but no difference in CPU or RAM usage. Screenshots are from dev.39, but the behavior was the same in dev.40.
Troubleshooting information (do not edit): Name: Gas Properties URL: https://phet-dev.colorado.edu/html/gas-properties/1.0.0-dev.39/phet/gas-properties_en_phet.html?profiler Version: 1.0.0-dev.39 2019-06-17 03:09:43 UTC Features missing: touch Flags: pixelRatioScaling User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36 Language: en-US Window: 1711x1294 Pixel Ratio: 1.5/1 WebGL: WebGL 1.0 (OpenGL ES 2.0 Chromium) GLSL: WebGL GLSL ES 1.0 (OpenGL ES GLSL ES 1.0 Chromium) Vendor: WebKit (WebKit WebGL) Vertex: attribs: 16 varying: 30 uniform: 4096 Texture: size: 16384 imageUnits: 16 (vertex: 16, combined: 32) Max viewport: 16384x16384 OES_texture_float: true Dependencies JSON: {}
Troubleshooting information (do not edit): Name: Gas Properties URL: https://phet-dev.colorado.edu/html/gas-properties/1.0.0-dev.39/phet/gas-properties_en_phet.html?profiler Version: 1.0.0-dev.39 2019-06-17 03:09:43 UTC Features missing: touch Flags: pixelRatioScaling User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18362 Language: en-US Window: 1809x1266 Pixel Ratio: 1.5/1 WebGL: WebGL 1.0 GLSL: WebGL GLSL ES 1.0 Vendor: Microsoft (Microsoft Edge) Vertex: attribs: 16 varying: 30 uniform: 4096 Texture: size: 16384 imageUnits: 16 (vertex: 16, combined: 32) Max viewport: 32767x32767 OES_texture_float: true Dependencies JSON: {}
Versions <= 1.0.0-dev.40 did not have the performance improvement that was make to repo kite to address Canvas performance issues in pH Scale (see https://github.com/phetsims/ph-scale/issues/83). So I'd like to see if that improvement resulted in any improvement to Gas Properties, since it also uses Canvas for drawing particles.
@KatieWoe please test 1.0.0-dev.43?profiler on Win10 + Chrome. Test on your personal machine and on the "brand new gaming PC" mentioned above. Let me know what the frame rate is. Thanks.
I can't really test the gaming pc since I believe that belongs to @arouinfar. I will test mine and @arouinfar, would you be willing to do so on your machine?
The earliest I'll be able to test is Friday. If it's been resolved on @KatieWoe's machine, then that's good enough for me.
@pixelzoom still seems to be about 35-36
OK, thanks @KatieWoe.
@arouinfar @ariel-phet Do we need to address this issue for 1.0.0 release? I have no immediate ideas for improving performance, and I don't have a Win10+Chrome test machine handy (would need to borrow one from PhET). This is also likely to require some of @jonathanolson's time.
@pixelzoom I personally do not think this needs to be addressed before the 1.0.0 release. While the performance is smoother on other browsers, the simulation still performs reasonably on Win10/Chrome. Since @ariel-phet is a Win10 user, I think it would be good for him to test it out on his machine and see how it feels. I'll defer final judgement to him.
@pixelzoom @arouinfar tested on my machine (Win 10 Chrome), and I agree performance is acceptable.
That being said, my machine did seem to be getting pretty darn hot...and...
Edge will be moving to a "chromium" based browser. This is also by far one of our most popular sims, so it at least seems worthy of some investigation and some of @jonathanolson time.
We do not necessarily need to fix, but while the sim is still fresh in people's minds and is getting some attention, it seems worthy of a bit of investigation. Especially since chrome is by far our most popular browser. @pixelzoom I think you could also ask an on campus dev to do some quick profiling. @chrisklus might be open to it.
Marking as medium, since as mentioned above, I don't think this blocks publication, merely worth of investigation.
Performance issues can typically involve substantial rewriting or reorganization of code. So while I appreciate that it's not blocking publication, I'd like to understand this issue a little bit (at a minimum, identify where the time is being spent) before moving into RC testing.
I tested this link https://phet-dev.colorado.edu/html/gas-properties/1.0.0-dev.43/phet/gas-properties_en_phet.html?profiler on my Chrome + Window 10 device and got in the high 40s low 50s, hovering right around 50 average at load. I do see a GPU spike.
Test device: Kathy's computer Operating System: MS Windows Enterprise 10, Version 10.0.17763 Build 17763 Browser: Chrome Troubleshooting information (do not edit): Name: Gas Properties URL: https://phet-dev.colorado.edu/html/gas-properties/1.0.0-dev.43/phet/gas-properties_en_phet.html?profiler Version: 1.0.0-dev.43 2019-07-22 18:36:42 UTC Features missing: generatedcontent, touch Flags: pixelRatioScaling User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36 Language: en-US Window: 1138x706 Pixel Ratio: 1.25/1 WebGL: WebGL 1.0 (OpenGL ES 2.0 Chromium) GLSL: WebGL GLSL ES 1.0 (OpenGL ES GLSL ES 1.0 Chromium) Vendor: WebKit (WebKit WebGL) Vertex: attribs: 16 varying: 30 uniform: 4096 Texture: size: 16384 imageUnits: 16 (vertex: 16, combined: 32) Max viewport: 16384x16384 OES_texture_float: true Dependencies JSON: {}
@jonathanolson this is the issue mentioned at 7/25/19 status meeting that needs investigation. Thanks in advance for your help.
I'm curious if it's related at all to an AMD GPU.
My Windows 10 specs are probably a bit older, but still fairly beefy, and the sim refuses to run below 60fps for me. AMD Ryzen 7 1800x (3.8 GHz), nVidia 1080ti, also 16 GB 3200 MHz RAM).
CPU profiling in devtools indicates that a decent chunk of time is spent on Canvas operations:
chrome:tracing shows not a significant GPU presence, but definitely shows additional time spent flushing Canvas:
I'll discuss more with @pixelzoom.
@jonathanolson asked:
I'm curious if it's related at all to an AMD GPU.
@KatieWoe does your Win10 machine have an AMD GPU?
No, Intel HD Graphics 530
Discussed with @jonathanolson and @ariel-phet. @jonathanolson is going to make a general scenery Node subclass that is optimized for rendering large numbers of similar images. It will use WebGL as its primary renderer, with Canvas fallback. @jonathanolson will create a scenery issue for this. He'll integrate it into Gas Properties (in ParticlesNodes
) behind a query parameter. We'll evaluate next Monday, and decide whether to include in 1.0.0.
In the meantime, I've made some straightforward optimizations to for
loops throughout the sim, but specifically for critical code that handles particle arrays. I'm iterating in reverse, so that the loop's exit condition is a constant, not reevaluated on every iteration. I've also factored out some constants from the for
loop in ParticlesNode.drawParticles
.
1.0.0-dev.44 contains my straightforward optimizations. @KatieWoe and @ariel-phet I'd be interested in knowing if this improved the frame rate on your Win10+Chrome configurations.
@pixelzoom - On mine the framerate now toggles in the high 50s, sometimes hitting 60 under the same conditions as above. and the GPU still goes up but is in the low 70s instead of 80%.
Thanks @kathy-phet.
@pixelzoom @kathy-phet @ariel-phet I tested dev.44 on our gaming pc. We ended up returning the AMD GPU and getting one from Nvidia instead, so it's not exactly an apples-to-apples comparison. That said, the performance in Chrome has remarkably improved. The frame rate never dips below 55 fps and GPU utilization stays below 30%.
Troubleshooting information (do not edit): Name: Gas Properties URL: https://phet-dev.colorado.edu/html/gas-properties/1.0.0-dev.44/phet/gas-properties_en_phet.html?profiler Version: 1.0.0-dev.44 2019-07-25 22:22:49 UTC Features missing: touch Flags: pixelRatioScaling User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36 Language: en-US Window: 1400x1290 Pixel Ratio: 1.5/1 WebGL: WebGL 1.0 (OpenGL ES 2.0 Chromium) GLSL: WebGL GLSL ES 1.0 (OpenGL ES GLSL ES 1.0 Chromium) Vendor: WebKit (WebKit WebGL) Vertex: attribs: 16 varying: 30 uniform: 4095 Texture: size: 16384 imageUnits: 16 (vertex: 16, combined: 32) Max viewport: 16384x16384 OES_texture_float: true Dependencies JSON: {}
Still hovers in the mid thirties for me.
Katie - do you have other things running? I noticed a decrease if I had other things running - like the sim in both firefox and chrome running at the same time.
I had various github and email/slack tabs open.
I just checked with everything but the sim closed. Same thing.
I retested dev.39 and dev.40 now that we've switched to a Nvidia GPU (which I should have done last night -- but I was a bit loopy from the travel). With the Nvidia card the frame rate is about 50 fps, and dips down to 44 when blowing the lid. GPU utilization was ~50%. There is definitely still improvement from dev.39/40 to dev.44 (~5 fps higher, 20-30% less GPU utilization) but it's not as dramatic as I was initially thinking. Perhaps that's why @KatieWoe isn't noticing much difference in frame rate?
@jonathanolson reported that he "ran into design and technical complications", so does not have the solution we discussed in https://github.com/phetsims/gas-properties/issues/146#issuecomment-515237667.
So we're going to defer this performance improvement (and this issue) until a future release. Approved by @arouinfar and @ariel-phet via Slack.
I've come up with a functional implementation for this that I have in a branch (relevant issue is https://github.com/phetsims/scenery/issues/990 in Scenery).
Would it be possible for someone (who ran into the Windows performance issue) to compare the performance on Win10 Chrome on the following two versions?:
The first version should be using WebGL for the sprites.
@KatieWoe is in the best position to test, since her personal machine was the primary test machine for this issue.
Link one hovers between high 30's and low 40's (38-42) Link two hovers in mid 30's (34-35) @jonathanolson
@ariel-phet, should I look into further refinement (or moving the Scenery work to master) for this? The initial work includes a lot of room for optimization, and without that it looks like it was ~5fps faster.
I guess I was anticipating more dramatic improvement with WebGL. I gained around 5fps with the straightforward optimizations that I mentioned in https://github.com/phetsims/gas-properties/issues/146#issuecomment-515237667. And I could get more dramatic improvements by combining for
loops (at the expense of code encapsulation and maintainability, so would prefer not to).
@jonathanolson do you have a WAG about what kind of improvement further optimization might yield?
I guess I was anticipating more dramatic improvement with WebGL.
Same. I'm curious if it's still being somehow CPU-limited. I'll profile a bit.
do you have a WAG about what kind of improvement further optimization might yield?
Not not sure, I would expect higher performance.
@jonathanolson and I discussed briefly on slack.
I suggested putting in approximately 4 more hours to introduce some optimizations, then lets have @KatieWoe reprofile (I can test on my machine as well). If all look good then we can move to master, but lets get it a bit better first.
Hi @jonathanolson, I'm new to this issue but came across it as part of https://github.com/phetsims/wave-interference/issues/331, linked above. Does your proposed branch support clipping? If so, we may be able to use it to attain good performance and proper clipping for the sound particles on Wave Interference for iOS.
Looking on master (8/7/2019) and something has definitely caused a big hit to performance. It happens on firefox too at least, so this may not be the best place for it, but adding particles/a sizable number of particles in the box can drop the fps to 7 and lag is very noticable.
@KatieWoe questions:
?ea
requirejs, with, not that I noticed
Thanks @KatieWoe. We established in https://github.com/phetsims/gas-properties/issues/144 that running with ?ea
results in a performance hit because of PhET-iO, and that we're willing to accept that performance hit until we investigate PhET-iO performance.
Can you please verify that performance is OK without ?ea
.
Confirmed. Sorry about that.
No problem, my heart rate has returned to normal :)
Does your proposed branch support clipping?
@samreid it does not currently support clipping. This could potentially be added with partial support (for axis-aligned rectangles), but the general issue would need to wait until we have code designed to draw paths for WebGL.
I suggested putting in approximately 4 more hours to introduce some optimizations
I added some optimizations, and merged the scenery work to master. However I'm curious about a few performance-related items that came up when reviewing performance traces for the given instructions:
https://phet-dev.colorado.edu/html/gas-properties/1.1.0-sprites.2/phet/gas-properties_en_phet.html is available with the optimizations made for sprite drawing.
@ariel-phet, let me know if I should look into the histogram-related performance, or what else I should do here.
@jonathanolson I don't think you need to do anything else here in the near term. The sim reached acceptable levels with other optimizations.
@pixelzoom you may want to look at a few of the things above, in terms of optimizing performance if we do a maintenance release in the near future (if it is low hanging fruit).
@jonathanolson said:
There are ParticlesNode instances in IdealGasLawParticleSystemNode. Is this for improved sim design, or for performance? (Having just one would be better for performance, and since the "outside" one always exists, the Canvas implementation drew over a larger area than is required).
There was originally a single Canvas for all particles (inside and outside the container). Consulting with @jonathanolson on 5/20/19, he told me that it would be more performant to have 2 size-optimized Canvases, rather than the single canvas. See https://github.com/phetsims/gas-properties/issues/38#issuecomment-491454072. So IdealGasLawParticleSystemNode
was reimplemented in https://github.com/phetsims/gas-properties/commit/9f98e538c307680e13172d1556f4f333fdedc3a3. Should I undo that work?
https://phet-dev.colorado.edu/html/gas-properties/1.0.0-dev.24/phet/gas-properties_en_phet.html?canvasBounds shows the bounds of the 2 Canvases after the change. The "outside" canvas is dynamically sized to the area of the ScreenView above the container. The "inside" canvas is statically sized to the maximum area of the container.
Before the change, ?canvasBounds
was not supported, but the bounds of the single Canvas looked like this, and was dynamically sized to fit the ScreenView:
The large amount of CPU overhead is related to the histograms. Around half of this was bounds-related (overriding
computeShapeBounds()
for those Paths could significantly improve sim performance), and the other half looked to be computation of the histograms themselves (HistogramsModel and the functions it calls).
Good to know that there's some performance to be gained in the histograms. But @KatieWoe's report that resulted in this issue was for the Ideal screen, which does not involve histograms, see https://github.com/phetsims/gas-properties/issues/144#issuecomment-512569554. That said...
Please provide details of which specific Path(s) you're referring to, and how you recommend overriding computeShapeBounds
.
I'll take a look at the histogram computations again in HistogramModel
to see what might be optimizable.
There was originally a single Canvas for all particles (inside and outside the container). Consulting with @jonathanolson on 5/20/19, he told me that it would be more performant to have 2 size-optimized Canvases, rather than the single canvas. So IdealGasLawParticleSystemNode was reimplemented in 9f98e53. Should I undo that work?
It would benefit only if the "outside" CanvasNode isn't included with its bounds all the time. Right now, even if there are no particles outside, Scenery still needs to have its actual Canvas covert the entire upper region. I'm happy to collaborate on this, and ?showFittedBlockBounds
should display where the actual Canvas is.
Good to know that there's some performance to be gained in the histograms. But @KatieWoe's report that resulted in this issue was for the Ideal screen, which does not involve histograms, see #144 (comment).
I had been testing the reproduction instructions in this issue, specifically https://github.com/phetsims/gas-properties/issues/146#issuecomment-513287180 (which has the histograms on)
Please provide details of which specific Path(s) you're referring to, and how you recommend overriding computeShapeBounds.
I didn't fully dig in to investigate which Paths were causing the most overhead. The pattern followed so far is path.computeShapeBounds =
with a function that will return a Bounds2 that should always include the path's contents.
Let me know if it would help to collaborate on the Canvas or Path issues.
I reviewed HistogramModel
and don't see any obvious opportunities for optimizing algorithms. I'm guessing that most of the cost is in array allocation for samples and bin counts. And changing that (e.g. preallocating and reusing arrays) would be nontrivial, because we have arrays of arrays that are smoothed over a sample period.
I'm also wondering about whether I should be using Shape.makeImmutable
. From Slack:
Chris Malley 11:25 AM I see a lot of calls to
Shape.makeImmutable
in vector-addition. Is this a performance optimization that I’m not aware of? If I know that a Shape will not be mutated, should I be calling makeImmutable?Jonathan Olson:house_with_garden: 1:17 PM If it won't be mutated AND will have a longer lifetime than (or be used by) multiple Paths, then it's probably good. Otherwise, a Path needs to add a listener to the Shape to see if it will be changed (which isn't necessarily a huge amount of overhead, but could be a source of memory leaks)
I don't totally understand @jonathanolson's response, so will add this to the list of things to discuss with him.
I discussed on Zoom with @jonathanolson. I'm going to do these optimizations, in this order, to verify that they have a positive impact:
(1) Path.computeShapeBounds
(2) ParticlesNode
(3) Array optimization for HistogramModel
(4) ParticlesNode
Optimizations (1) and (2) will benefit the Energy screen only, since they are related to histograms.
Also noting that we discussed Shape.makeImmutable
. It's not relevant here. It's useful when you have a persistent Shape that is used with multiple Paths.
The commits above implement optimizations (1) and (2) described above, in master and 1.0 branches.
@KatieWoe could you please verify whether there is a performance improvement on Win10+Chrome? Please compare 1.1.0-dev.1 to 1.0.0, running with ?profiler
, using the Energy screen with both histograms expanded and all checkboxes checked.
I'm seeing ~60fps in both versions on macOS + Chrome (MacBookPro15,1), so it's difficult for me to tell whether these optimizations had any impact.
In https://github.com/phetsims/gas-properties/issues/144#issuecomment-512569554, @KatieWoe reported performance on a range of platforms (abbreviated below). Frame rates are for the built version running with
?profiler
.Some of these should be examined before RC testing, to see whether they are OK or require attention. The most curious one is Win10 + Chrome.