Open djrain opened 1 year ago
I can confirm this on 4.1.stable (Linux, GeForce RTX 4090 with NVIDIA 535.54.03).
Clipping disabled | Clipping enabled |
---|---|
11471 FPS (0.09 mspf) | 1902 FPS (0.53 mspf) |
(`disabled` stretch mode)
Clipping disabled | Clipping enabled |
---|---|
5711 FPS (0.18 mspf) | 1478 FPS (0.68 mspf) |
(`canvas_items` stretch mode, makes clipped sprites larger)
Clipping disabled | Clipping enabled |
---|---|
4114 FPS (0.24 mspf) | 844 FPS (1.18 mspf) |
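Since FPS is the reciprocal of frame time, the drops above are easier to compare as milliseconds added per frame. A minimal sketch of that arithmetic, using the FPS figures from the three tables (the helper names are made up for illustration):

```python
def frame_time_ms(fps: float) -> float:
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


def clipping_cost_ms(fps_disabled: float, fps_enabled: float) -> float:
    """Extra milliseconds per frame attributable to enabling clipping."""
    return frame_time_ms(fps_enabled) - frame_time_ms(fps_disabled)


# (FPS with clipping disabled, FPS with clipping enabled), in table order.
measurements = [(11471, 1902), (5711, 1478), (4114, 844)]

for disabled, enabled in measurements:
    cost = clipping_cost_ms(disabled, enabled)
    print(f"{disabled} -> {enabled} FPS: +{cost:.2f} ms per frame")
```

Seen this way, clipping adds roughly 0.44 ms, 0.50 ms and 0.94 ms per frame in the three configurations, which is a more even picture than the raw FPS ratios suggest.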
What's interesting is that while GPU usage goes up with clipping enabled, power consumption goes down (despite both cases running at the same GPU core and memory clocks):
[Images: GPU usage and power draw with clipping disabled vs. clipping enabled]
It's quite a bummer to have such a useful feature hindered by poor performance :( @Calinou can we tag someone who might have some ideas?
@Calinou hi! Are there any updates on this topic? Will it be fixed?
This highlights an important difference between mobile and desktop GPUs. Due to a difference in architecture, mobile GPUs pay a much larger performance penalty for every pixel touched and for switching render targets.
CanvasItem clipping requires switching render targets twice (once to back buffer, then back to front buffer) and the way it is implemented now requires a full screen copy (it touches a lot of pixels).
On desktop this is essentially free as the cost of switching render targets and copying pixels is super low.
Ultimately the render target switching can't be reduced so clipping will always be somewhat expensive on mobile.
Right now we always copy the full front buffer when doing the clipping, but we really only need to do that when mipmaps are enabled. We can use a tighter clipping rect to reduce the cost of copying the pixels, but I doubt even that optimization will be enough to make this efficient on mobile devices.
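To get a feel for how much a smaller copy rect could save, here is a rough back-of-the-envelope sketch. It assumes one buffer copy per clipping node per frame, and the screen size, node size and node count are hypothetical numbers picked for illustration, not measurements from the engine:

```python
def pixels_copied(num_clipping_nodes: int, copy_width: int, copy_height: int) -> int:
    """Pixels copied per frame if each clipping node triggers one copy of the given size."""
    return num_clipping_nodes * copy_width * copy_height


# Hypothetical scene: all values below are assumptions for illustration only.
SCREEN_W, SCREEN_H = 1920, 1080   # full front buffer
NODE_W, NODE_H = 128, 128         # on-screen bounds of one clipped node
NUM_NODES = 20                    # nodes with clipping enabled

full = pixels_copied(NUM_NODES, SCREEN_W, SCREEN_H)
tight = pixels_copied(NUM_NODES, NODE_W, NODE_H)

print(f"full-screen copies: {full:,} pixels per frame")
print(f"rect-sized copies:  {tight:,} pixels per frame")
print(f"reduction: {full / tight:.0f}x")
```

Under these assumptions the smaller rect copies about 127 times fewer pixels. The two render target switches per clipping node are not modeled here and would remain either way, which is why the saving may still not be enough on mobile.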
> On desktop this is essentially free
But that doesn't seem to be the case: both Calinou and I observed a significant FPS drop on desktop.
My performance issues were coming specifically from CanvasGroup, not CanvasItem.
Perhaps there should be a 'warning label' in the Godot documentation about using CanvasGroups in mobile projects?
As a new transplant from Unity, I started 'intuitively' using CanvasGroups all throughout my project. It wasn't until significantly later that I realized it was killing my app's performance. I worry that a lot of other newbies will do the same thing, and that may lead to doubts about Godot's capabilities.
Thinking more about this, I can think of two things to explore to improve performance:
In the short term, we may indeed want a clear warning in the documentation, as the current design is very bad for mobile and that won't change without drastic intervention.
@clayjohn Hi, tell me if I'm wrong, but what do you think about doing the clipping inside a SubViewport, to generate the clipped image only once and display it as a texture?
That's definitely an optimization you can do today. It is slightly more cumbersome than using `clip_children` directly, but it allows you to cache the results of `clip_children`, which will be a net win for performance. In cases where you don't need the `clip_children` node and child nodes changing every frame, it is definitely best to cache the results in a SubViewport and apply the SubViewport's texture directly.
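The caching idea itself is a plain dirty-flag pattern: do the expensive clip once, reuse the result every frame, and redo it only when the content changes. The sketch below is engine-agnostic Python, not Godot API. `CachedRender` and `expensive_clipped_render` are hypothetical names, and in a real project the cached result would be the SubViewport's texture rather than a string:

```python
from typing import Callable, Optional


class CachedRender:
    """Caches the output of an expensive render function until marked dirty."""

    def __init__(self, render: Callable[[], str]) -> None:
        self._render = render
        self._cached: Optional[str] = None
        self.render_count = 0  # how many times the expensive step actually ran

    def mark_dirty(self) -> None:
        """Call when the clipped content changes so the next get() re-renders."""
        self._cached = None

    def get(self) -> str:
        """Return the cached result, re-rendering only if the cache is empty."""
        if self._cached is None:
            self._cached = self._render()
            self.render_count += 1
        return self._cached


def expensive_clipped_render() -> str:
    # Hypothetical stand-in for the costly clip-and-composite step.
    return "clipped image"


cache = CachedRender(expensive_clipped_render)
for _ in range(60):   # 60 frames with static content: rendered once, then reused
    cache.get()
cache.mark_dirty()    # the content changed once
cache.get()           # triggers a single re-render
print(cache.render_count)
```

The trade-off is that something has to call `mark_dirty()` whenever the clipped content changes, which is the "slightly more cumbersome" part compared to enabling `clip_children` directly.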
Could this be done by only updating when needed? For instance, only updating:
? Assuming there isn't something blocking this that I'm missing, it would make static art the default (updating only based on what is added), solving proposal 8747.
Also mentioned in the PR (cascaded) above:
instancing optimization would be very useful particularly for tilemaps and spawning, which need it due to rendering the same scene multiple times over (perhaps animations making that a bit more tricky).
MSAA imposes a massive limit on clipping instances with the PR (it only renders 22-25% of the instances compared to 4.2.2's implementation, at least for me with a 1050 Ti, where MSAA is a major limit), though I have not tested whether downscaling could be a better alternative for getting AA.
I would say that perhaps clipping could be skipped in certain scenarios, though that would probably be mostly used for eyes (when the iris is not near the edge) and maybe a few other character/dynamic things and probably not much else (most interesting art done with this will always need clipping). Though in cases where that is viable, it could be turned off by animating/scripting the value.
In a similar vein, could partial/incremental updates be a thing, particularly for less complex setups/untextured polygons?
Godot version
4.1 stable
System information
macOS, Android, iOS - all renderers
Issue description
We're making a mobile game, and we were getting poor performance on Android. After some testing we found that having even a small number of nodes with clipping enabled was the culprit, making this feature borderline unusable. Turning clipping off brought the game back up to acceptable FPS.
The slowness happens in all renderers and not just on mobile. Having more children being clipped does not seem to matter much - only the total number of nodes with clipping enabled.
I would normally assume that clipping has a significant performance impact, but this blog post stated that clipping happens "at literally no cost". That sounded too good to be true, but given that claim, this result is very unexpected.
If a notable performance hit is indeed expected when using this, it would be nice to have that documented.
Steps to reproduce
Run the MRP's main scene. Press Enter to toggle clipping on and off, and observe the huge FPS change (note that V-Sync is off). On my M1 Mac, it drops from around 1300 FPS to about 75.
Minimal reproduction project
ClipChildrenSlow.zip