c-frame / aframe-physics-system

community-maintained fork of n5ro's aframe-physics-system
https://c-frame.github.io/aframe-physics-system/
MIT License
43 stars 11 forks source link

Extensions to performance stats #17

Closed diarmidmackenzie closed 2 years ago

diarmidmackenzie commented 2 years ago

image

Various extensions to performance stats for both Ammo & Cannon drivers

Note that, particularly for Ammo.js, I don't really understand all the stats reported, and the state of Bullet documentation means I think it would be pretty hard to get a good grip on these.

Even if the stats aren't fully understood, I think they have potential value:

While it would be great to understand and explain these stats better, I don't think that's feasible in the short term (not for me, at least), but I think it's worth offering these stats up in this form in any case.

Preview of the code is hosted at https://diarmidmackenzie.github.io/aframe-physics-system

See perf, stress and sweeper examples.

kylebakerio commented 2 years ago

wow, what a great feature. thanks so much for this!

kylebakerio commented 2 years ago

note: links aren't relative, so clicking https://diarmidmackenzie.github.io/aframe-physics-system and then clicking through to examples leads to the c-frame master examples instead of your local ones. may want to update those while you're here.

for my own future use, and for anyone else, here are direct links to perf examples: https://diarmidmackenzie.github.io/aframe-physics-system/examples/ammo/perf.html https://diarmidmackenzie.github.io/aframe-physics-system/examples/cannon/perf.html

Formatting isn't behaving great for me: image

notice how AVG overflows on the panel on top left; also "ave" instead of "avg" on the black background

kylebakerio commented 2 years ago

image

Note that this overflow happens even with 2k width provided to the window.

kylebakerio commented 2 years ago

I also find the units and information unclear at this stage, even though in the last week or so I had looked at the code. Probably just one sentence at the top of that black section there could go a long way to explaining what the chart is. (To be clear: talking about the black chart here, not the stats on the left, which I'm fine keeping in their current form)

That, along with changing Tick (msecs) to something clearer; maybe I'm missing something obvious, but this is not clear to me. Shouldn't it be ms / tick?

Should we not also be including the average tick length here somehow as well to make this contextually easier to understand?

kylebakerio commented 2 years ago

Also, would be nice if min/max/avg were in the same order in both panels

kylebakerio commented 2 years ago

All that said: while this is obviously great, fwiw when I asked for min/max/avg, I didn't mean per 100 ticks, I meant of the per-100-tick-avg values themselves--in other words, the average, max, and min of values recorded while page is open. (Even more ideal would actually be if this was graphed, but maybe it's something I'll add in the future.)

If I think about the value of these numbers, or how to use them...

  1. I think median would actually be more useful than average here. I want to know how long it takes in a typical frame, and median gives me that better than average.
  2. I think accumulating these values while the page is open is more useful. A steadily updated median and max values that are recorded, and some kind of overall-performance number like '<2ms 99.9% of ticks' or something would be useful.
  3. I think an ideal that is out of scope would be showing two curves:
    • a distribution graph of ms taken per tick; x is ms taken, y is recorded ticks that took that many ms
    • a graph showing the fluctuating values live; x is time, y is ms taken
    • preferably these are bar graphs that show before/engine/after in different colors making up the bar of each tick

the first would make it clear that while the average may be '2', that could be because it almost always takes .5 and you periodically lose a whole frame, which is a very different result than '2'. also really useful for debugging if that type of scenario is indeed the case, figuring out what the spikes are, as opposed to being falsely over complacent about the acceptable average, or helping know that there's an ideal target for substantial performance improvements.

the second would similarly be helpful processing the numbers, as a stream of numbers is hard to watch and grok, while a heartbeat of resource usage is much more intuitively obvious.

Not saying we should add graphs right now--and we don't need to, this is a great foundation to add that in the future--just sharing thoughts that come to mind.

But I do think having overall recorded performance while page is open would be ideal, and less overwhelming than recording 12 new raw numbers per 100ms.

that distribution graph should ideally be cumulative for all ticks in this demo, but in real world dynamic projects when profiling it would be more useful if it showed e.g. the last 1 second's ticks or something along those lines I imagine.

kylebakerio commented 2 years ago

seriously, though, I might write up a little tool to add these graphs to a low overhead basic canvas element after this pull request. Might want to talk about adding these into a panel so that they can be used in the real world in people's apps.

kylebakerio commented 2 years ago

I went ahead and put together a micro chart library to make those two graphs: https://glitch.com/edit/#!/custom-canvas-graphs

Here's plugging in with some real data, this is ammo on left and cannon on right:

image

one histogram on the bottom is only displaying the averages. I'm going to add another to displaying the distribution of max values here in a sec.

for the top chart, it's showing time spent per tick--orange is 'rest of the tick', blue is 'engine', purple on bottom is 'before', and there's a tiny green between the blue and orange that is 'after'.

While this is already interesting, I think 'median' would be ideal, as well as some way to get all the values to build up that distribution graph.

But I also feel like this adds a lot of clarify to what is really going on here and helping people get a feel for the data and how to really compare these performance numbers.

min/max/avg, descending, for ammo

image

the data is a bit misleading in this form, though, because the 'max' and 'avg' are weighted evenly (visually), even though one includes the other as well as represents the weight of many. ideally every individual value would go into that graph, and things like median, average, max, min, would all intuitively be visible in the one graph.

probably a separate pull request? the lib is a quick hack and needs some cleaning up, regardless... but it might help think about what data you're trying to communicate?

not sure if it should become a tool here, should be its own separate micro lib, should be a pull request as a feature in your stats panel extension lib you just built, or should just be thrown into these pages for now directly, or what, but whatever it is I do like the idea of having it both for profile the engines performance in these demos and for profiling apps used live.

diarmidmackenzie commented 2 years ago

@kylebakerio Thanks for all that, lots there to respond to...

In no particular order...

Netting all this down, a proposed set of next steps:

A couple of changes that would be pretty cheap to make, but I'm proposing not to do.

I think that's covered everything you raised - let's talk further if you don't agree with this plan.

diarmidmackenzie commented 2 years ago

One other point worth commenting on:

note: links aren't relative, so clicking https://diarmidmackenzie.github.io/aframe-physics-system and then clicking through to examples leads to the c-frame master examples instead of your local ones. may want to update those while you're here.

I don't know how to solve this. There's a good reason the links aren't relative - they wouldn't work in both the README view in GitHub and also on GitHub pages.

But if they are absolute, they need to updated every time you fork / merge. Which would quite likely lead to broken links in master (whic IMO is worse than broken links in a preview of a PR).

The problem seems to be documented here: https://stackoverflow.com/questions/40196198/automatic-link-to-github-pages-url-from-readme-md

It's possible that this GitHub Action could provide a solution. There don't seem to be any other solutions available. https://github.com/marketplace/actions/dynamic-readme

kylebakerio commented 2 years ago

Yeah, this all makes sense to me and I largely agree. I think we can just run with this for now, but I do think that a good single graph helps one grasp pretty much all of the relevant numbers at once and makes for a great way to get a 'feel' for the performance characteristics and differences.

That said,

Let's continue this discussion about graphing stats - but let's do it in the context of how to make best use of all stats we might get out of A-Frame + its components, rather than just aframe-physics-system.

In agreement there, will just table it for now and keep thinking about this, probably will just roll the graphing options into vr-super-stats and go do some upgrades there; perhaps your aframe stats component should be absorbed into vr-super-stats and live there?

As for this PR:

I'm not sure about this, but a thought I have... what if instead of events, there was an option to define a function that, if present, would be fed values as arguments? In fact, it doesn't even have to be 'instead of' events, you could leave this all here, but it would be nice to be able to specify a function that should be called on tick and just receive all the raw data to do with it whatever it wants. For example, if I wanted to have a distribution graph that accurately accounted for every tick, and then render it once per 100ms, the current interface doesn't really directly expose that as an option to me. You could perhaps change the 100ms to a configurable value, but it doesn't seem great to be emitting events on every tick like that as a primary interface, either... which is why I wish I could just pass in a function with an expectation that it would have relevant arguments passed in and I could work from there.

Probably pretty easy to implement, but I can also just go back and implement it myself later if you don't feel like it.

Mean -> Median. I just don't think it makes much difference, and doesn't solve the real problem, which is that we'd like a solution that offers genuine flexibility to users in how they view their stats.

I think in this demo it isn't super different, but I think in some cases it could be significant. median will just give a better idea of what a typical frame time is, while average will have the max value represented fractionally within it. If you have 1 two 45ms ticks per second but all others are .1, median will communicate that they're at .1, while average will show you a value that could be ten times that amount. I think situations like that are not at all unrealistic in real world scenarios for this lib.

AFAIK, core A-Frame stats don't provide any sort of external access to the raw data generated - they just display in the panel in the top left corner.

Yeah, I think this is right, and it's pretty frustrating, and causes performance problems for vr-super-stats; I have to parse the HTML, which is terrible. That was the solution in the library I forked from, I didn't find a way around it, and never got around to basically re-implementing it from scratch, but that was my intention (either that or find some hacky way to grab the stats from a-frame somehow). I did eventually write some custom stats stuff I didn't release that was more along those lines, wasn't really production ready though and didn't get to finishing it. An overhaul on all of that has been on my to do list. I mention this in the readme; the impact full stats viewing has is high in e.g. standalone quest, which is why I focused on producing a 'performance mode'--if you want 'real' stats you have to turn off everything and use performance mode, otherwise the rest is helpful for grasping relative impact, but not getting truly accurate numbers. It can still be a very useful tool, but that has always bothered me.

Key question for me is how much of this belongs inside aframe-physics-engine, and how much belongs in a separate library or framework.

Yeah, I think the idea of building out the tooling in vr-super-stats/etc. and then implementing the tooling here in some profiling demos is the way to go, we're in agreement on that.

and I don't personally find myself making much use of the graphs in A-Frame stats...

I will say for raf/fps for example, it's helpful to have a a graph because a momentary drop in those numbers and go unnoticed, especially from say 90 -> 45, but the visual of a graph can make it much more obvious that happened even if you weren't staring right at that one number at the moment it happened. Same effect applies for many things.

diarmidmackenzie commented 2 years ago

I think in this demo it isn't super different, but I think in some cases it could be significant. median will just give a better idea of what a typical frame time is, while average will have the max value represented fractionally within it. If you have 1 two 45ms ticks per second but all others are .1, median will communicate that they're at .1, while average will show you a value that could be ten times that amount. I think situations like that are not at all unrealistic in real world scenarios for this lib.

OK, I'm convinced on medians, rather than means. Computing medians means storing off all the data, at which point we can compute pretty much any percentile, which is definitely useful.

But I don't like building this into aframe-physics-system - it's too generic to belong here.

I've extended aframe-stats-panel with another component stats-collector which can take a stream of events, and report summary stats including mean, max, min, median, percentiles on a configurable interval. Very basic right now, but could easily be extended to add flexibility.

See example here: https://diarmidmackenzie.github.io/aframe-examples/component-usage/stats-collector.html

Would be pretty simple to now integrate aframe-physics-engine with this, just fire off an event every frame with the data from that frame, and all the data analysis is done in stats-collector.

Just a couple of integration issues:

Any comments / feedback before I move in this direction?

I'd be very happy for aframe-stats-panel (including the new stats-collector to be brought into c-frame, and be extended with graphing functions, VR display etc.

One other comment that I should address in terms of implementation...

it doesn't seem great to be emitting events on every tick like that as a primary interface, either... which is why I wish I could just pass in a function with an expectation that it would have relevant arguments passed in and I could work from there.

In general I strongly prefer events over function calls for inter-component communications, unless there are very specific requirements that only functions can meet (e.g. synchronosity, or performance).

I suspect function calls are more performant than events, but I've seen 1000s of events fired per frame without affecting performance too badly. If we are only firing 1 or 2 events per-frame, and they only happen when stats are enabled, I really don't think this is a significant issue performance-wise.

diarmidmackenzie commented 2 years ago

Have pushed all the above into the PR. Now looking like this: image

diarmidmackenzie commented 2 years ago

@kylebakerio How do you feel about merging this?

Future topics for exploration (though not in this PR, I suggest) are...

Stats architecture:

What stats, summarizations, visualizations etc. are in fact most useful when analyzing physics perf?

kylebakerio commented 2 years ago

Love the updates! I can definitely make better charts with this data. I'm not aware if best practice forbids generating this many events, didn't see a clear discussion when I searched it. Should we still include the option to only generate the summaries, and provide the raw per-tick-event only if specifically requested? Not sure if it really impacts are not.

I wonder what's causing the left panel to still not render right for me? image

I'd still merge it with that and just fix it in the future, not going to hold this back over what amounts to a CSS tweak.

Also, some interesting data after some chart upgrades and using the new raw tick event:

image

Based on these numbers, cannon and ammo actually are looking comparable here, no? Look at the 'total' histogram distribution.

I think that was just a fluke (interesting, though--I wonder what caused it?) here are more typical numbers:

image

Note that I've added in contacts... the contacts in ammo is much higher than in cannon. Does that mean ammo is doing more complex physics and taking more into account? Or is this a quirk of how they report?

Also, in regards to your comment on contacts/bodies: as it isn't available on tick, I resorted instead to calling the countBodies functions manually, and grabbing the data manually. I am doing it every time I capture an event, fwiw. That mimics my original suggestion of how this might play out.

Just as you prefer events, I actually tend to avoid events, as event-based dev in large websites is something the industry moved past years ago, as it was common to end up with an unmaintainable mess of difficult-to-trace behaviors, as it slowly would become unclear how things integrate and what the cascading consequences of any given event might be, etc.

But that doesn't mean they can never be used, and they are kind of nice here; I just personally find functions a cleaner paradigm. A-Frame does seem to be organized at least partially around events, for better or worse, so it definitely feels semantically correct.

I'm not necessarily encouraging anything different, but if you want everything through an event based interface, that was data I wanted (in case those contacts spiking would be related to occasional spikes in engine time or wrapper time or something), and it wasn't available at a matching sample rate for me. (I can comment out the line that looks up the body counts and see if it is making an impact quite easily, too, which is nice.)

An option might be the ability to specify if you want the body data events, and for all stats events the option to specify a throttle?

kylebakerio commented 2 years ago

I've been experiementing with the charts and numbers, and... I think the collisions number is behaving kind of weird in ammo. See the behavior on initial load to see what I mean here: https://aps-graph-demo.glitch.me/examples/ammo/perf.html

It starts around 200ish, and then rises up to a peak of 457, and pretty much stabilizes just below there? Number doesn't have enough fluctuations to look natural, either, and the number rise doesn't match a rise in dynamicBodies, which was my first guess. Why the slow rise and higher value?

Also, a different view of the two engines with the 'rest of tick' removed: image

kylebakerio commented 2 years ago

Maybe they should be labelled 50th/ 90th/ 99th for consistency? I do think 90th is way more useful than min, so that's great to have.

Also, was able to trigger this state where ammo started performing similar to cannon again: image

This time I was throttling my cpu on both windows, and then unthrottled. While I didn't do that before, it's possible that changing window focus causes a similar effect (backgrounding a tab causes raf to be throttled in chrome).

kylebakerio commented 2 years ago

As for the bigger question, while the stats-collector idea is nice, it is opinionated and expects the component to be analyzed to have matching events, I guess? One can always just generate the events one needs oneself, though, of course.

I would just prefer a function I can call rather than emiting an event, and while in general I'm open to the event model, it feels to me like doing live stats in a performance sensitive environment is probably the one place where it makes the most sense to pick the more performant option.

Not suggesting we rewrite this right now, I think this is fine to accept as-is, but if we're talking about a model for stats collection for a-frame as a whole going forward...

A question I had when looking at the stats collector usage in this code was whether you have 3 or 4 different listeners, all listening for the same event? Or whether you intelligently pick up that multiple components registered the same event and collect them into one listener?

I think event-based architectures are seductive in the beginning but less maintainable long term, and do come with performance implications. Making sure the emitted of the events recycles their objects rather than generates new ones every time is another performance concern. The easy with which they can result in memory leaks is also a concern.

kylebakerio commented 2 years ago

I think A-Frame is not likely to add more stats infrastructure, but I think a good basic tool like that could live in C-Frame. That said, if we're doing it right, I'm not sure the A-Frame stats panel is really that ideal one true platform. If we built it as a single canvas item, I think that would be more performant than touching the DOM but would also make it very clean and easy to view those same stats as a single texture on a plane in VR, giving a consistent experience and a way to get interface for both while only writing once.

diarmidmackenzie commented 2 years ago

Thanks for all this - some responses, but in summary I think we're OK to merge this one now.

Should we still include the option to only generate the summaries, and provide the raw per-tick-event only if specifically requested? Not sure if it really impacts are not.

Given the events are only enabled when stats are turned on, and it's just one event / tick / scene, I don't think there's a need to turn it off.

I wonder what's causing the left panel to still not render right for me?

This is weird. I removed a couple of the spaces, which looks slightly less nice on my display, but should look better on yours. The headings should probably use some proper HTML layour elements, rather than trying to line stuff up with whitespace, but I can't see a way to do that without overcomplicating things. image

Maybe they should be labelled 50th/ 90th/ 99th for consistency? I do think 90th is way more useful than min, so that's great to have.

I stuck with Median, but changed Max to 99th. Internally it's still computed as "max", which gives a very slightly different result from computing the 99th percentile, but I don't think the distinction is worth dwelling on.

Based on these numbers, cannon and ammo actually are looking comparable here, no?

Maybe, but I have a couple of significant perf fixes for Ammo in the works that are relevant for this test, which should push it back ahead.

Note that I've added in contacts... the contacts in ammo is much higher than in cannon. Does that mean ammo is doing more complex physics and taking more into account? Or is this a quirk of how they report?

I don't fully understand all the numbers, and I don't think they are comparable betwen engines. Also, the Ammo ones look buggy to me. I don't understand why Collisions & Collision Keys diverge, and I don't understand how Collisions can be greater than Manifolds. I suspect a possible leak of Collisions objects, but haven't didn't want to get sidetracked by investigating that. Stats are a great tool for potentially spotting bugs with the physics system internals.

I resorted instead to calling the countBodies functions manually, and grabbing the data manually.

Makes sense. Events implies a "push" model for statistics. Given that the engine can't anticipate all the needs of someone investigating a perf issue, I guess there is a good case for offering a "pull" model, and that more-or-less implies a functional interface. On the other hand, you might want a push model in some cases - the physics engine knows exactly where it is in its processing cycle, so it knows the best time to gather up data... whereas someone external might request data at a bad time when data is poentially inconsistent.... Lots more to think about here in terms of use cases etc.

An option might be the ability to specify if you want the body data events, and for all stats events the option to specify a throttle?

Possibly, but until we have evidence that any of this is actually causing a perf impact for anyone, I don't think I'd bother.

While I didn't do that before, it's possible that changing window focus causes a similar effect

Changing window focus seems to cause a spike in Ammo physics engine processing. Interestingly PhysX seems a lot more stable in this respect.

A question I had when looking at the stats collector usage in this code was whether you have 3 or 4 different listeners, all listening for the same event? Or whether you intelligently pick up that multiple components registered the same event and collect them into one listener?

Not yet - I don't see why it couldn't be enhanced to work this way, but for now I've gone for simplicity over performance. I doubt there's a big performance issue with the way it's working at the moment - might be worth revisiting if we were tracking many more different stats on a single scene.

kylebakerio commented 2 years ago

Made a slight update to show the most captured value on the histograms to make it easier to compare, and split out before/after. Looking forward to seeing the ammo updates and comparing the results!

(Worth noting, I ran these simultaneously, refreshing both immediately and then jumping to a third window so neither would be 'in focus'.) (left is cannon, right is ammo)

image

kylebakerio commented 2 years ago

Ah, kept forgetting to mention:

There are two entities with id="pinboard" in your examples. just got reminded while looking at the physx example. also saw it here.

<a-scene physx="autoLoad: true; delay: 1000; useDefaultScene: false;stats: panel"
             tick-time-display="sceneOutputEl: #tickTimerScene">
        <a-entity id="pinboard" pinboard="physics: physx; height:20; width: 20" position = "0 0 -20" rotation = "45 0 0">
        </a-entity>

        <a-entity id="pinboard" ball-recycler="physics: physx; ballCount: 100; width: 15; depth: 15; yKill: -8" position="0 8 -20">
        </a-entity>
diarmidmackenzie commented 2 years ago

Ah, kept forgetting to mention:

There are two entities with id="pinboard" in your examples. just got reminded while looking at the physx example. also saw it here.

<a-scene physx="autoLoad: true; delay: 1000; useDefaultScene: false;stats: panel"
             tick-time-display="sceneOutputEl: #tickTimerScene">
        <a-entity id="pinboard" pinboard="physics: physx; height:20; width: 20" position = "0 0 -20" rotation = "45 0 0">
        </a-entity>

        <a-entity id="pinboard" ball-recycler="physics: physx; ballCount: 100; width: 15; depth: 15; yKill: -8" position="0 8 -20">
        </a-entity>

Fixed it.