playcanvas / engine

JavaScript game engine built on WebGL, WebGPU, WebXR and glTF
https://playcanvas.com
MIT License

Extend dynamic batching to support skinned meshes #3543

Open mvaligursky opened 2 years ago

mvaligursky commented 2 years ago

This could be a good way to gain some performance when rendering many skinned meshes with the same materials.

Existing PR could form a base of this: https://github.com/playcanvas/engine/pull/1618

sakidev commented 2 years ago

Thanks for looking into this again @mvaligursky. Speaking to other devs who want to use PlayCanvas to make MMO-like games, we would really appreciate it happening. Our MMO characters need to be batched, or our audience's devices will never keep up when more than 20 players are in the same spot (imagine something like World of Warcraft with hundreds in the same spot!).

I've been trying to learn how the new BatchManager works for the past week and trying to implement this myself, but I haven't got anywhere solid yet, and I'm quite frustrated with it, in fact. So knowing that you're also on it relieves me a lot. I can't wait to get this feature into the engine and be able to build great games where lots of people can socialize together 😄 ❤️

mvaligursky commented 2 years ago

For now I've only created an issue (one of 400 we have on the engine), but I am not actively working on it :(

mvaligursky commented 2 years ago

But to be honest, if you want to run lots of animated models, the draw call cost is not the first target to optimise. Usually the cost of animating / updating bones and other things is at least as high. Batching also restricts you to using the same material on your models, which is not common for many games. It is common for RTS kind of games, but to get good performance there, you'd need a better custom solution; batching will only make a small difference.

sakidev commented 2 years ago

Yeah, you are right. I did some tests with an older build of the engine using Glidias' implementation, and batching 200 characters was surprisingly disappointing: I got 20fps on my phone, and I was expecting more.

What would you advise on this? What would be the way to optimize the cost of animating? Updating the bones less often? I've seen this happen in MMORPGs when characters are far enough away.

Do you know any other ways?


Maksims commented 2 years ago

Definitely fewer skeleton updates, even up close when the area is densely packed. MMOs do that, indeed. LoDs can also apply to skeletons, not only models: no need to animate fingers or feet for far away characters.

But tbh, 200 characters, on mobile, that is ambitious in general.

yaustar commented 2 years ago

Fewer bones where possible in the character, and if you aren't already using the AABB override on the model/render component for culling, that can help too:

https://developer.playcanvas.com/en/api/pc.ModelComponent.html#customAabb
https://developer.playcanvas.com/en/api/pc.RenderComponent.html#customAabb
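
As a rough illustration of the AABB override, here is a minimal sketch. `setCullingBox` is a hypothetical helper name, `pc` is the PlayCanvas namespace passed in explicitly, and the box is assumed to be given in the entity's object space; the only engine pieces relied on are `pc.BoundingBox`, `pc.Vec3` and the `customAabb` property linked above.

```javascript
// Hypothetical helper (not an engine function): assign a fixed culling box
// so the engine does not have to derive bounds from bone positions each frame.
// `center` and `halfExtents` are [x, y, z] arrays, assumed to be in object space.
function setCullingBox(pc, entity, center, halfExtents) {
    const box = new pc.BoundingBox(
        new pc.Vec3(center[0], center[1], center[2]),
        new pc.Vec3(halfExtents[0], halfExtents[1], halfExtents[2])
    );

    // Both the render and the model component expose `customAabb`.
    const component = entity.render || entity.model;
    if (!component) return null;

    component.customAabb = box;
    return box;
}
```

The box should be generous enough to contain the character in every pose, otherwise it can be culled while still partly on screen.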

mvaligursky commented 2 years ago

Do you know any other ways?

You could consider a system that uses instancing. At game start, or even better ahead of time, sample all animations at, say, 10fps and store the data in a single texture (per character type). Then use instancing to render the characters. Typically with instancing you store a matrix per instance to place your mesh. In addition, you'd need to store an index into your animation texture, most likely a single float value that places your current time between two sampled times.

Then you'd use a custom vertex shader that does the skinning using this animation texture. Based on the specified time, it would sample the data for the two stored keyframes, one before and one after that time, and linearly interpolate between them, then build a matrix from the resulting position and rotation and skin the vertex. This way all the animation work moves to the GPU.

Thinking about it .. it'd be fun to write a quick prototype for this. Maybe one day.
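
To make the keyframe lookup concrete, here is a CPU-side sketch of what such a vertex shader might compute per bone. All names (`frameLookup`, `sampleBone`, the `frames` layout) are hypothetical, looping playback is assumed, and rotations are blended with a normalised lerp as a cheap stand-in for slerp; this is not engine code.

```javascript
const SAMPLE_RATE = 10; // baked samples per second, as in the 10fps suggestion

// Map an animation time (seconds) to the two surrounding baked frames
// and the blend weight between them. Assumes the clip loops.
function frameLookup(time, frameCount, sampleRate = SAMPLE_RATE) {
    const duration = frameCount / sampleRate;
    const wrapped = ((time % duration) + duration) % duration; // handles negative time
    const f = wrapped * sampleRate;
    const frame0 = Math.floor(f) % frameCount;
    const frame1 = (frame0 + 1) % frameCount;
    return { frame0, frame1, blend: f - Math.floor(f) };
}

// Linear interpolation of two [x, y, z] positions.
function lerp3(a, b, t) {
    return a.map((v, i) => v + (b[i] - v) * t);
}

// Normalised lerp of two [x, y, z, w] quaternions, taking the shorter arc.
function nlerp(a, b, t) {
    const dot = a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
    const sign = dot < 0 ? -1 : 1;
    const q = a.map((v, i) => v + (sign * b[i] - v) * t);
    const len = Math.hypot(...q);
    return q.map((v) => v / len);
}

// frames[frame][bone] = { position: [x, y, z], rotation: [x, y, z, w] }
function sampleBone(frames, boneIndex, time) {
    const { frame0, frame1, blend } = frameLookup(time, frames.length);
    const a = frames[frame0][boneIndex];
    const b = frames[frame1][boneIndex];
    return {
        position: lerp3(a.position, b.position, blend),
        rotation: nlerp(a.rotation, b.rotation, blend)
    };
}
```

In the GPU version, `frames` would be the baked texture, `time` would come from the per-instance float, and the interpolated position and rotation would be turned into the bone matrix used for skinning.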

leonidaspir commented 2 years ago

And if your game design allows it definitely disable/pause animations on non visible models.

I think this optimization isn't happening by default.

mvaligursky commented 2 years ago

Good call @leonidaspir. We discussed it as a feature when designing the system, and now is a good time to consider adding it. I've created an issue to track it: https://github.com/playcanvas/engine/issues/3563

leonidaspir commented 2 years ago

On that note, another optimization that may be easy to add to the anim/animation component update loop in the engine is an optional fixed timestep per component.

That way a user can easily implement a LOD system that updates far away skeletons at a lower frequency. Right now, doing that requires patching the onUpdate method of the anim/animation component systems.

Here is a sample patch for the onUpdate method:

    const animationComp = component.entity.animation;
    // dt is shared by every component in the loop, so work on a per-component copy
    let stepDt = dt;

    // --- check if we have requested a fixed timestep for this component
    if (animationComp.fixedTimestep > 0.0) {
        if (!animationComp.fixedTimer) animationComp.fixedTimer = 0.0;
        animationComp.fixedTimer += dt;

        // --- skip this frame until enough time has accumulated
        if (animationComp.fixedTimer < animationComp.fixedTimestep) continue;

        // --- update the animation using the total accumulated delta
        stepDt = animationComp.fixedTimer;
        animationComp.fixedTimer = 0.0;
    }
    // pass stepDt (not dt) to the component's update call that follows

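
For the LOD side, a standalone sketch of how the timestep might be chosen and accumulated is below. `timestepForDistance` and `accumulate` are hypothetical helpers, the distance thresholds are placeholders to tune per game, and `fixedTimestep` / `fixedTimer` are the properties introduced by the patch above, not existing engine properties.

```javascript
// Hypothetical helper: pick an update interval (seconds) from camera distance.
// 0 means "update every frame". Thresholds are placeholder values.
function timestepForDistance(distance) {
    if (distance < 10) return 0;
    if (distance < 30) return 1 / 15;
    return 1 / 5;
}

// Same accumulator logic as the patch, outside the engine loop.
// Returns the dt to apply this frame, or null when the update is skipped.
function accumulate(state, dt) {
    if (!(state.fixedTimestep > 0)) return dt;

    state.fixedTimer = (state.fixedTimer || 0) + dt;
    if (state.fixedTimer < state.fixedTimestep) return null;

    const total = state.fixedTimer;
    state.fixedTimer = 0;
    return total;
}
```

A game script could then set the patched `fixedTimestep` on each character every so often, based on its distance to the camera, and leave the accumulation to the component system.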
mvaligursky commented 2 years ago

But tbh, 200 characters, on mobile, that is ambitious in general.

If you can live with many clones of the same character, this is doable using something like this: https://medium.com/tech-at-wildlife-studios/texture-animation-techniques-1daecb316657

I implemented this on mobile a while back, and in an empty scene it could render 10k individually animated characters (with unique colors / textures, but the same vertices), at around 500 vertices each.

ertugrulcetin commented 1 year ago

Hi @mvaligursky @yaustar 👋🏻 Is there any update on this one? It'd be really cool to have this optimization.

yaustar commented 1 year ago

@ertugrulcetin I'm afraid I don't have anything on my end. Martin may have some extra thoughts but won't be back in the office till next week.

So far, this has been the approach that a number of developers are taking: https://github.com/playcanvas/engine/issues/3543#issuecomment-940980765

Much of the cost in animation is the matrix calculations of the bones. See https://forum.playcanvas.com/t/gpu-skinning/28034

ertugrulcetin commented 1 year ago

Hi @mvaligursky 👋🏻, just curious, is this feature/optimization in the pipeline?

mvaligursky commented 1 year ago

Not in the short term. As I mentioned above, I'm yet to see a real world case where I think this would be more beneficial than other optimisations we should do. What is your use case @ertugrulcetin?

ertugrulcetin commented 1 year ago

Rendering 50 animated models in a scene/map, grouped into parties of 5 characters, so there will be 10 parties max. They're going to fight (particles, some skills etc.).

mvaligursky commented 1 year ago

So let's say that will create 50 render calls (without any batching). Are you finding this to be a bottleneck, compared to the cost of animating and flattening the bone arrays for those 50 characters? Based on profiling I've done, I would expect those to be much larger costs, and more worth optimising.

ertugrulcetin commented 1 year ago

I see. If the triangle count is too high due to the number of characters (and the nature of their models), isn't this a problem for performance? (I assume this ticket also solves that, if I'm not wrong?)

cost of animating and flattening the bone arrays for those 50 characters...

Also, is any optimization of these costs planned in the pipeline?

mvaligursky commented 1 year ago

This ticket only solves the number-of-batches problem, which is a cost on the CPU. It does nothing about triangle count, which is an art problem that affects the GPU.

There is an optimization for off-screen characters planned for the future, but nothing else at the moment. We did a pass on character optimization about a year ago and it's in a pretty good state.

ertugrulcetin commented 1 year ago

Thanks for the update!

sakidev commented 1 year ago

Hey, just wanted to pop up here because I just tested how to do "wearables" on PlayCanvas, and it's been incredibly easy with the new animation system. Dunno who wrote it but hats off to you, great work!

Now we can have for example gloves on a character when they pick them up in-game and it works with the same skeleton as the base one - before, we had to duplicate skeletons and bind the transforms to the base one and that would obviously cost performance in the long run!

willeastcott commented 1 year ago

Thanks for the feedback, @sakidev - the lead dev on the animation system is @ellthompson and I agree, he's done an amazing job!! 🚀