godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License
89.88k stars 21.03k forks source link

Testing results for 2D Unified Batching (GLES2 & GLES3) #42899

Closed lawnjelly closed 3 years ago

lawnjelly commented 3 years ago

Important

rendering->batching->options->use_batching_in_editor needs to be set to true for 3.2.4 beta 1. Sorry, this is due to a small bug on my part. Otherwise you will get a multicolored IDE! :grinning:

About

This issue is to collect user feedback about the new 2d unified batching for 3.2.4 implemented in #42119. This PR affects GLES2 users as well as GLES3, there are several new improvements to GLES2 batching, including new primitives, improved support for batching custom shaders, and software skinning.

To some extent this post mirrors information in #38004 as the unified batching is largely based on the GLES2 batching codebase.

See for details as to how batching works: https://docs.godotengine.org/en/stable/tutorials/optimization/batching.html

Introduction

3.2.4 introduces unified 2D batching support to GLES2 & GLES3. This is quite a major modification to the renderer, so we are expecting a certain amount of issues, it would be helpful to keep them here together so we can work through them.

In the old renderer, 2d primitives were drawn individually. This is simple to implement, but doesn't make best use of the speed of modern GPUs, which work far more efficiently when you tell them to draw a bunch of similar primitives at once. Batching works by placing an intermediate layer between Godot and the backend renderer, which groups similar primitives together so that they can be drawn at once. And this intermediate layer is now shared between backends, so that the same improvements that were made to GLES2 are now available to GLES3.

Note that batching is purely a performance optimization. If working correctly there should be no visual differences or differences in behaviour. It can work on several primitives including rects, lines and polys.

Expectations

Some games will benefit from batching more than others. In particular games using tilemaps or a large amount of text are most likely to benefit.

If you notice a slowdown in your game, it is highly likely to be because batching defaults to a slower method of drawing, which prevents flicker on nvidia hardware. You can switch this back to the old behaviour by setting rendering->batching->options->single_rect_fallback.

If you are using 2d lights you may also find it useful to tweak rendering->batching->lights->scissor_area_threshold. You can get significant improvement to speed using 2d lights in some circumstances.

Mac Users

If you experience performance problems, you are highly encouraged to try the experimental settings to change how OpenGL is used, and report back on your findings. Full details here: https://github.com/godotengine/godot/issues/43038#issuecomment-722336615

Additional

Along with performance improvements due to batching, the PR also introduces optional software skinning for 2d skeletal animations. This should hopefully greatly improve compatibility of skinning, especially on mobile phones, and may lead to speed increases. See rendering->quality->2d->use_software_skinning.

There is also the option to select between the current and old ninepatch rect behaviour, see rendering->quality->2d->ninepatch_mode.

Bugs

There are 2 types of issue you are most likely to see:

1) Performance issues

We are particularly interested if you have a project that runs slower with batching than without (and is not cured by turning on single_rect_fallback, see above). If we can isolate these cases and work out what is going on, we can fix them to prevent any slowdowns in the next version (and in some cases they can be accelerated).

If you get an increase in performance, let us know! :+1:

2) Visual regressions

If you encounter visual differences between your project with and without batching, that is something we need to fix. Please let us know full details.

A Quick Guide

Settings are described in: https://docs.godotengine.org/en/stable/tutorials/optimization/batching.html#project-settings

Tips

The order that objects are rendered generally follows the order in the scene tree. If you can group similar items together in the scene tree (similar material, texture etc), they are more likely to be batched together, and give you a speed increase.

Batching will be broken by (among other things):

The best batching project settings will depend upon the game. Be sure to try varying every one, to see the effect on performance. A simple way to measure FPS is to disable vsync, and set the project setting debug/settings/stdout/print_fps (although you should disable this for exports, as printing anything can affect performance itself).

Diagnose Frame

This is a very useful option in order to optimize your game (make sure to switch it off when exporting though, as it costs performance). It prints out every 10 seconds a list of the batches used in the frame. This list may be quite long for complex games, you may need to increase the buffer size in project_settings/network/limits/debugger_stdout/max_chars_per_second to read it all.

Details are in the documentation. https://docs.godotengine.org/en/stable/tutorials/optimization/batching.html#diagnostics

Template

As with most issues, it is useful if you provide us with the following info:

In addition if you are dealing with a slowdown, it can be helpful to provide (or link to) your diagnose_frame log.

If you are reporting increase in performance, only minimal info is needed (is totally up to you). :+1:

fire commented 3 years ago

How does one keep 2d batching?

In https://github.com/fire/Godot-Control-Masking we were afraid it would cause batching to fail, since the mask method runs the gpu shader on all the children. Even if masking was an engine module it would be effected.

lawnjelly commented 3 years ago

It is hard to say in advance, but if you run with diagnose_frame you will see whether they are batching. If it cannot, it will revert to the old method of drawing.

oeleo1 commented 3 years ago

All this sounds terrific. At first glance, it looks terrific too. :-)

image

The [redering] section of the project.godot has these among others: ... quality/driver/driver_name="GLES2" ... batching/options/use_batching=false batching/options/use_batching_in_editor=false

If I quit, remove (manually) the line about use_batching_in_editor=false and restart, the UI looks normal.

At startup (Windows 10, 64-bit) it says invariably that batching is on though:

Godot Engine v3.2.4.beta1.official - https://godotengine.org
OpenGL ES 2.0 Renderer: Quadro T2000 with Max-Q Design/PCIe/SSE2
OpenGL ES Batching: ON

At your service to help debugging it.

PS: Too many visual regressions to report... Not sure whether they shall be discussed in this thread or in a bug report.

PPS: With lovely different colors on every launch. I already start regretting this being fixed ;-)

image

bitmapman5 commented 3 years ago

Just wanted to pop in and quickly say thanks for this! I've been working on a dwarf fortress-like colony sim in Godot 3.2.3 stable and had resigned myself to using GLES2 for the batching. With this PR it looks like rendering the massive pile of tilemaps in my game is in the neighborhood of 6x faster on GLES3, about the same speed as GLES2 on 3.2.3. I haven't noticed any issues so far but I'll come back and make a proper report if I do. Thank you!

IncubateGames commented 3 years ago

GLES3: Batching: off image

Batching: on image

Windows 10 i5 3570 16MB DDR3 GTX1060 6GB oc

IncubateGames commented 3 years ago

GLES3: Batching: on single_rect_fallback: on image

single_rect_fallback: off image

Windows 10 i5 3570 16MB DDR3 GTX1060 6GB oc

lawnjelly commented 3 years ago

@Oskit-Producciones That sounds good.

The drop in performance for single_rects is totally expected. The 'safe' routine is 2-3x slower. However on balance in most projects there will be a lot of batched primitives, which will more than make up for this. And you can always turn the single_rect_fallback on (if you don't mind some of your uses having flicker).

@oeleo1 I'll try and work out what is happening for you. Good news is though that 'big regressions' tend to be as easy or easier to fix than small regressions (strangely enough).

OMG I can't believe it I get the same oeleo when I set batching to false and use_batching_in_editor to false! :grin: I can't believe I never tried that combination before.

It also occurs in GLES3, whenever batching is turned off in the editor. I'm stumped as to why we didn't notice this before, as we've been turning this on and off through development (and it working fine). So a recent change must have caused this, it may be an uninitialized settings variable, or something external to the renderer. Will find it soon! In the meantime, use batching in the editor haha! :smile:

EDIT : Found it

I think this is down to some last minute changes I made to buffer orphaning in the legacy renderer to try and fix performance issues on Mac (nothing to do with batching), I thought these were pretty benign but obviously didn't test enough. This should not be a big deal and easy to fix or revert. It may just be affecting legacy polygons (i.e. mostly the editor with batching switched off).

Yes confirmed, it's the reduction in glSubBufferData calls in drawing the legacy polygons, it's only a few lines, and reverting it works fine. Sorry my bad! :blush:

I'll try and work out why the new method is failing - ah got it, the input buffers are separate, rather than one block. So the optimization can't be used.

Fixed

PR is linked. However that does mean you guys will have to have batching on in the editor for this beta, sorry about that.

ghost commented 3 years ago

when I set batching to false and use_batching_in_editor to false!

That's a pretty wild one. I feel for you here. 😆

Can also confirm Windows 10 - OpenGL ES 2.0 Renderer: GeForce GTX 1660 SUPER/PCIe/SSE2

shajid10 commented 3 years ago

metroidvania.zip

Godot Version: 3.2.4beta1 OS: Endeavour OS Device: HP 2000 Notebook GPU: Intel HD 4000

After adding a wind shader to the tilemap, the tilemap disappears randomly at random points. The project file is included. The tiles are visible if batching is turned off. But they start flickering after batching is turned on.

Steps to reproduce:

  1. Open the project
  2. Run the project
  3. You can't see the grass without moving to the top platform. Press L to jump
lawnjelly commented 3 years ago

metroidvania.zip

Yes I can replicate this here. I've been adding some more code in 3.2.4 to deal with more custom shaders, so I expected there to be more regressions to fix here before everything is running smoothly.

The diagnose log shows it seems to be splitting it up correctly, so that is good. It also seems to work with a fragment shader, but touching the VERTEX position causes the showing up / not showing up. This is one of the new paths where it stores the parent transform and the local vertex in the flexible vertex format. I may need to debug this when I get home in a few days, but I'll see if I can work out what might be going wrong.

Ah from moving around the scene in the editor, the grass doesn't disappear, it shows really big in the wrong place. So something in combining the transforms in the new path is mucking up. Yes even the wavy grass is working, only the transform is wrong. It is possible it is applying the parent transform to the uniform AND in the shader (i.e. twice) which could be causing the problem. Think this should not be a big deal to solve.

lawnjelly commented 3 years ago

metroidvania.zip

Ok I am back home and able to compile debug etc and have this fixed now. :+1:

As I suspected it was doing the transform twice (it uses hardware transform mode under some circumstances, such as only 1 rect in a batch), and it also now stores the parent transform in the new large flexible vertex format for dealing with custom shaders.

You may find that if you used the same shader on a batch containing several rects with the same texture, it would work (because hardware transform not used). That's probably how I initially tested it so didn't notice this problem.

The solution was to disallow hardware transform under all circumstances when using large_fvf. I still want to go over it tommorrow when I've had a nights sleep before making a PR, just to check there is no side effects, or a better way.

This is encouraging that the custom shader batching seems to be working. That was one of the major weak points of the 3.2.2 batching. I'm hoping most custom shaders should be batched now so even advanced games will benefit.

p.s. Just for academic interest you will get better batching for your foliage if you combine them into a single texture atlas. (You can see this if you turn on diagnose_frame in batching, and hide everything but the foliage). I shouldn't worry too much as your drawcalls are now quite low, so you are getting most of the benefit already.

A tilemap using interspersed rects with different texture will not get reordered currently. It will do reordering for items, but tilemaps are made up of lots of commands in one item (which is part of what makes them efficient) but the downside is no re-ordering. Well for now at least.

QbieShay commented 3 years ago

I get a crash when opening a certain scene. I'm using gles3, batching and batching in editor are on.

Here is the repro: repro-batch-gles3.zip.zip

Navigate to effects/enemies/projectile/InfectionProjectile/BulletEffectInfection.tscn. Open the scene, editor crashes with SIGILL (4)

PC specs: OS: Linux 5.9.1-arch1-1 GPU: Radeon 590, driver amdgpu

akien-mga commented 3 years ago

@QbieShay Thanks! Might be the same bug as #42994 (fix merged a few hours ago) as I can't reproduce the crash in latest 3.2 (efdc5f0f899a8313cd557f87cc7d478692ebdb72 + some cherry-picks I haven't pushed yet).

QbieShay commented 3 years ago

@lawnjelly @akien-mga it isn't happening anymore on 0b6d

lawnjelly commented 3 years ago

@lawnjelly @akien-mga it isn't happening anymore on 0b6d

Yes I think the fix for this was #43102. There's just the faintest possibility of a similar problem with polys but I've not been able to produce it and had no reports (it might be that polys are handled fine by the existing logic).

EIREXE commented 3 years ago

Trying latest 3.2 (adf2dabbd) I get some strange artifacting around some of my graphics, they are controls being scaled in and out within a viewport rendered on top of a meshinstance.

This happens with UV contract and on windows and linux.

Big performance jump though, particularly from character drawing.

https://youtu.be/tUUygHPq2yI

lawnjelly commented 3 years ago

Trying latest 3.2 (adf2dab) I get some strange artifacting around some of my graphics, they are controls being scaled in and out within a viewport rendered on top of a meshinstance.

This happens with UV contract and on windows and linux.

If you are able to make a minimum reproduction project, I can investigate. It does look like a filtering / precision issue.

EIREXE commented 3 years ago

Trying latest 3.2 (adf2dab) I get some strange artifacting around some of my graphics, they are controls being scaled in and out within a viewport rendered on top of a meshinstance. This happens with UV contract and on windows and linux.

If you are able to make a minimum reproduction project, I can investigate. It does look like a filtering / precision issue.

Hmm I think it's gonna be hard but i'll give it a go, also I found some strange geometry that's drawn by the game (but doesn't seem to be visible) with batching enabled, not sure what that is or if it may even be my fault.

imagen

(left is batching on, right is batching off)

EIREXE commented 3 years ago

I've tried reproducing the issue myself by isolating menu assets on a new project and I can't seem to make it happen on an isolated project, only inside my game, which is quite odd.

I also tried disabling some nodes and it appears like that changes the artifacting quite a bit:

https://youtu.be/ZXRjHJuFbxU

lawnjelly commented 3 years ago

Sometimes it can be difficult to deduce what is needed for a minimum reproduction project, but that is really the first step that is needed, especially in game project with a lot going on, so I would encourage you to keep at it. If you can't reproduce it, it is far less likely I will be able to, without knowing how your scene is arranged etc.

I would do exactly as you are doing .. try and hide various branches of the scene tree in your game project till you can find the minimum to produce an artifact, then try and reproduce it in a simpler project.

Some things which are some of the possible culprits:

It is likely there is more than 1 bug, the filtering / precision / wrapping looks different to the extra geometry bug. Geometry being drawn in the wrong place is a common bug because of the multiple matrices that are used in godot pipeline. In your second video lines seem to be coming out of the top left screen corner, which is odd, unless you are using that for an effect? Also are you using any procedural geometry (i.e. you create the verts yourself)? Also are you using any 3d (e.g. drawing a viewport on a 3d sprites etc?)?

EIREXE commented 3 years ago

I have managed to make it happen on a test project, seems like the shader used is the culprit.

BatchingArtifacting.zip

lawnjelly commented 3 years ago

When I run your example I don't seem to be seeing any regressions (Linux Mint 18.3, Intel HD graphics 630). If you can confirm you are seeing something different, what GPU etc are you using?

That's assuming I don't have to do anything other than run the scene to see the regression?

EDIT: Ah figured it out: You have to press 'V' to cause it to happen, it does seem to be happening on mine.

I think it's something to do with the tween, it still artifacts even if I comment all the lines out in the shader for if enabled. We've actually never tested tweens with shaders specifically.

There's still far too much going on in the test project though, it is not a minimum reproduction project. I've managed to figure out it is nothing to do with the viewport, and occurs in 2d with none of the 3d stuff needed. But there's still umpteen million scripts going on, so making it quite difficult to investigate.

I suspect it's something to do with the custom shaders and polys, and the tween. The mechanism is probably that the shader input attributes are incorrect, but figuring out what triggers it is more interesting.

Ok I've been gradually getting rid of all the superfluous stuff to make it a minimum reproduction project: BatchingArtifactingEIREXE.zip

Intriguingly it seems to be something to do with a custom style on the Control (SongListItemNormal.tres). When I remove that style, it seems to work ok. I have no idea how the style gets drawn, perhaps it it something in the shader conflicting. It may be that I'll just have to set it to revert to legacy renderer when using a custom style.

lawnjelly commented 3 years ago

Getting there on this, I had to go out this afternoon, but the problem is occurring only when modulate is applied to the control (that has the style) that is non 1, 1, 1, 1 color. There is a different path when using white, because it affects baking colors. I suspect it is changing to large_fvf and this path may not have been fully tested with polys yet.

The style seems to be working by drawing a huge number of polys in the background to the control. Surprisingly large numbers, like 1900 polys for just one control, so this would have been really slow without batching.

Edit : Half working now. It basically boils down to I hadn't yet implemented the extended FVFs supporting custom shaders for polys. So I could have just switched off batching if custom shaders were used, however a lot of the framework is there already to properly support in the same way as with rects, so I'm having a go at doing that in GLES3, then in GLES2. Worst case I can switch off batching for these cases, but I think it should work.

The new primitives open up a bigger matrix of possible use cases compared to just rects, so it will probably be slower covering all the possible issues as they crop up, especially as relatively few people use primitives other than rects in games.

Some cases that are probably broken:

ghost commented 3 years ago

@lawnjelly Maybe not necessary, but here's what I see from my end with your revised MRP.

Win10 GeForce GTX 1660 SUPER/PCIe/SSE2 (GLES2 and GLES3)

ezgif com-gif-maker (1)

lawnjelly commented 3 years ago

Yup it should be the same for everyone, it is not a GPU problem. It's just using the wrong shader for the vertex format because I hadn't implemented the custom shader stuff for polys. I'm nearly done sorting it now though. :+1: :grin:

Is only taking longer because I want to batch it properly rather than just revert to legacy.

lawnjelly commented 3 years ago

The PR above is now in for this poly corruption issue, and solves a bunch of other things too which haven't been noticed yet. I haven't tested yet but I don't think it will solve the texture filtering issue in @EIREXE project, as that is separate problem I think.

While testing it I also uncovered a regression in the 2d software skinning, unrelated. It is possible my fix for the samurai sword project #42576 could have broken something in the other skinning case. I'll try and get this fixed next.

EIREXE commented 3 years ago

Hello again, not sure if this is an issue on my end, but disabling the node TimingArm here makes batching work for the yellow hearts, however enabling it breaks it, is this because the notes all share the same texture as their source (using an atlas) and the timing arms do not? I also tried changing the reordering and lookahead values and that doesn't seem to do anything.

imagen

imagen

lawnjelly commented 3 years ago

Hello again, not sure if this is an issue on my end, but disabling the node TimingArm here makes batching work for the yellow hearts, however enabling it breaks it, is this because the notes all share the same texture as their source (using an atlas) and the timing arms do not? I also tried changing the reordering and lookahead values and that doesn't seem to do anything.

Just to be clear, are you asking about a visual regression, or asking how to improve batching performance in this particular case? (I'm guessing you asking about improving batching).

To see what is happening the key is to use the diagnose mode in the project settings. This will log a list of commands and items every approx 10 seconds to the IDE.

Things that will typically break batching include texture and material changes. Item re-ordering can help with this type of situation, but not in all cases, and it will be limited by the reordering lookahead project setting.

I would probably need an example project to diagnose properly, but a diagnose log may help.

Alternatively, if you can rearrange your items so all the hearts are together and all the swingarms are together that will normally solve this kind of problem. But this is a situation re-ordering can normally help with.

EDIT Ah! I think I've got it. I think they are not reordering because they are failing the overlap check. If you make your hearts further apart, you might find they re-order.

The reason is that it cannot reorder is that if it fails the overlap check, it cannot guarantee that the rendered result will be identical.

See: https://docs.godotengine.org/en/stable/tutorials/optimization/batching.html#a-trick

So In this case, you might either try and use the same texture / material (use an atlas) for the swingarm and heart (in which case hopefully no re-ordering should be necessary), or try having 2 separate lists in the scene tree, so that they are grouped together, and will make it far easier for the batcher.

hearts
- heart
- heart
- heart
swingarms
- swingarm
- swingarm
- swingarm
EIREXE commented 3 years ago

I see, that's the solution I expected then, thank you very much!

oeleo1 commented 3 years ago

I tried 3.2.4-beta3. Great job! Most bugs I saw originally seem to have been addressed. I still have one visual regression which I would like to report here. It's a bit of a convoluted setup, but I believe you will grasp it easily.

This occurs on iOS target only (not on Android, MacOS, nor on Windows) and only in GLES2. Batching is disabled. Here is the minimal project I ended up with sprite_test.zip

This basically is a Sprite with a custom shader swapping between 4 textures. There is a Light2D which is supposed to hide some parts of the screen containing the texture mask. The problem is that what we see the Light2D texture in the Sprite, instead of the 1st sprite texture (out of the 4).

It sould look like this: good

But it looks like this: bad

Removing the light, shows the sprite correctly. GLES3 doesn't exhibit the problem.

NB: Not sure if this is related or not, but I notice that exporting to iOS requires me to check PVRTC compression in the settings. This is wrong, but I will raise a bug about it separately. The project is common to different platforms, it used to work fine in 3.2.3 / iOS without that requirement, so this new requirement is not appropriate.

oeleo1 commented 3 years ago

PS: I seem to remember seeing a commit by @clayjohn related to the number of texture slots or similar, but I just can't tell exactly what it was about. Could it be related? Very strange this occurs on iOS only and none of the other targets we usually work with.

zzz-assault commented 3 years ago

Godot version: 3.2.4 (Beta 3)

OS: Windows 10 CPU: I7 8700 GPU: RTX 2080

Project used for testing is heavily using tilemaps to create a huge map (generated completely at the beginning, just optimized by visibility notifiers) - in the test case map size was 2k x 2k tiles -> both measurements are not 100% comparable due to proc gen content, but after several tests the result is basically not really affected.

I also haven't found any graphic issues so far!

Comparison - stable 3.2.3 (GLES3) vs. beta 3 3.2.4 (GLES3):

image

Btw. with activated Y-Sort the draw calls are still optimized, but the FPS stay identical with and without batching (no improvement) -> I assume in this case bottleneck is Y-Sort logic instead of draw calls.

GREAT JOB lawnjelly!

lawnjelly commented 3 years ago

This occurs on iOS target only (not on Android, MacOS, nor on Windows) and only in GLES2. Batching is disabled. Here is the minimal project I ended up with

If this occurs with batching off it sounds less likely to be a batching bug (so would be better in a separate issue, but I guess we can address here), however it may be a regression introduced by other changes to the renderer. You describe it as a regression, which previous version can you confirm it worked in?

You mention 3.2.3 but I'm not sure whether that is talking about PVRTC or the main problem you describe.

It is possible it is to do with @clayjohn's changes, but also highly likely is that you are simply running out of texture slots on your hardware. Mobile often has a maximum of 8 slots, and many of those are reserved in godot. This may be one for clayjohn in fact.

If you are running out of texture slots two things spring to mind as most obvious culprits:

Also, I see you are using nvidia_workaround. Have you tried it on the hardware both with and without the workaround? That is another possible.

lawnjelly commented 3 years ago

Btw. with activated Y-Sort the draw calls are still optimized, but the FPS stay identical with and without batching (no improvement) -> I assume in this case bottleneck is Y-Sort logic instead of draw calls.

This does sound likely, if the drawcalls are reducing and the FPS isn't increasing it does usually mean the bottleneck is elsewhere... y sort may be expensive and maybe it is done every frame where not required. It is outside the rasterizer and I'm not really familiar with it.

oeleo1 commented 3 years ago

You describe it as a regression, which previous version can you confirm it worked in?

It is a regression from 3.2.3-stable for sure. I just went through the multicolored IDE of beta1 and can confirm that it works fine on the target as well. So this broke since beta2 or 3.

Sorry for reporting this issue here if it is unrelated to batching, but this is mainly due to my ignorance (and I'd be happy to open a separate issue in the bug tracker). I thought you were interested in any visual regressions compared to the legacy. Also, I do want to test this much improved version of batching but need a non-batching performance basis first. Was torn away from the colorful beta1, skipped beta2 as too many residual visual regressions were still in. Now with beta3 I have only 2 visual regressions with batching turned off.

  1. The messed textures I reported above
  2. Sliders which doesn't fill properly. I believe this is a corollary of the ninepatch bug which was fixed after beta3 came out, so I'll wait on this until the next beta to confirm.

Re: PVRTC, yes, it seems totally unrelated.

Re: texture slots -- primary suspect at this point methinks. Note that the sprite & light are under a CanvasLayer. If I move them above the CanvasLayer, the problem goes away. If set the visibility of the light to false, the problem goes away.

oeleo1 commented 3 years ago

I just went through the multicolored IDE of beta1 and can confirm that it works fine on the target as well. So this broke since beta2 or 3.

Booh! I shamelessly lied. This is broken since beta1. Previously did the test with the wrong renderer (GLES3 instead of GLES2) but hey, the colorful IDE is so much fun to navigate and switch renderers. ;-)

lawnjelly commented 3 years ago

Yeah with the other bugs, if they are not batching, then 1 issue per bug is much easier to deal with, so I would encourage you to report them separately. Worst case they might be duplicates.

And also given the iOS nature of the 2d light bug, it is probably worth separating this too as it may require some verbosity to solve... you could just copy paste the info from here? If a canvas layer is involved this is also relevant.

clayjohn commented 3 years ago

@oeleo1 for reference here is the PR I made https://github.com/godotengine/godot/pull/42538.

It leaves some devices with only two custom texture slots instead of 3.

oeleo1 commented 3 years ago

@clayjohn Right, thanks. The PR suggests 3 slots have been a favor for low end devices, but actually this happens on all iOS devices, including latest models (iPhones and iPads). One could argue that Apple/iOS is low end :-) but I am not into this debate. If there is a way out with 2 slots by refactoring stuff, I'll happily embrace it, but I don't see it (yet)...

clayjohn commented 3 years ago

@oeleo1 Fair enough, it looks like all iOS devices are limited to the minimum of 8 texture slots. :/

oeleo1 commented 3 years ago

@lawnjelly @clayjohn Now onto some real feedback on the topic. This batching effort is an exceptional piece of work!

I did some profiling with Apples's Instruments on target hardware (5 year old iPhone SE 1st gen - supposedly a weak model as of this writing) without and with batching on a real action game. Batching was enabled with the default settings. Everything is GLES2.

Without batching, the main game load results in 56-60 FPS with regular dips down to 45 FPS on graphic intensive situations. Some particularly heavy scenes with thousands of (CPU) particles drop down to 25-30 FPS.

With batching, the number of draw calls is reduced by anything between 20% and 50%. The main game load results in steady 59-60 FPS with the graphic intensive situations dropping down to 56 FPS. I didn't see a lower number during the main play. The heavy scenes with thousands of particles drop down to the same amount of 25-30 FPS which is no surprise since particles are out scope.

I rarely engage in superlatives but what you've done here deserves the community's utmost respect. Bravo!

jordi-star commented 3 years ago

I'm getting a Visual Regression when attempting to create a NinePatchRect from gdscript with batching enabled on GLES3.

On Godot 3.2.3, No Batching (GLES3) image ^ This is what it's supposed to look like

On Godot 3.2.4, With Batching (GLES3) image ^ Incorrect. The border of the box is not being scaled correctly.

Is there a fix for this?

anthonyromano commented 3 years ago

Hi

I have a regression with batching. I've attached two screenshots, both running on beta 4, one with batching on one with batching off. On the stable build, batching on or off makes no visual difference. There are no Light2Ds in these screenshots. As you can see, in beta4 (and beta3 also had this bug, I did not test beta 1 and 2) the enemies vision cone fades out when it approaches the player. The player has a screen reading shader around him to create the "light" effect. The enemy vision cones are also screen reading shaders. The issue seems to be when they overlap.

If I add a BackBufferCopy as the first node on the player, it "fixes" the issue. But I assume toggling batching on shouldn't change the visuals of the game or require extra nodes? And it hasn't in the past builds either, so I'm guessing something is going on.

Batching On (Beta4):

Screen Shot 2020-12-12 at 1 55 03 AM

Batching Off (Beta4) (And what batching on and off look like normally)

Screen Shot 2020-12-12 at 1 54 24 AM
lawnjelly commented 3 years ago

I'm getting a Visual Regression when attempting to create a NinePatchRect from gdscript with batching enabled on GLES3.

There are now 2 ninepatch modes, see project_settings->rendering->quality->2d->ninepatch_mode, you may be relying on behaviour from the puthre method (scaling). Try the other setting, and let us know if it fixes for you.

lawnjelly commented 3 years ago

I have a regression with batching. I've attached two screenshots, both running on beta 4, one with batching on one with batching off. On the stable build, batching on or off makes no visual difference. There are no Light2Ds in these screenshots. As you can see, in beta4 (and beta3 also had this bug, I did not test beta 1 and 2) the enemies vision cone fades out when it approaches the player. The player has a screen reading shader around him to create the "light" effect. The enemy vision cones are also screen reading shaders. The issue seems to be when they overlap.

I'll take a look at this when I'm home, but I'll need a minimum reproduction project.

jordi-star commented 3 years ago

I'm getting a Visual Regression when attempting to create a NinePatchRect from gdscript with batching enabled on GLES3.

There are now 2 ninepatch modes, see project_settings->rendering->quality->2d->ninepatch_mode, you may be relying on behaviour from the puthre method (scaling). Try the other setting, and let us know if it fixes for you.

With NinePatch Mode set to default: Screenshot_20201212_110952

With NinePatch Mode set to scaling: Screenshot_20201212_111024

It's still not correct compared to Godot 3.2.3: image

lawnjelly commented 3 years ago

@summertimejordi If you can make a min reproduction project I can take a look. :+1:

jordi-star commented 3 years ago

@summertimejordi If you can make a min reproduction project I can take a look.

I had just finished doing that when you replied :+1:

Here it is: https://drive.google.com/file/d/1xODicuJRJrrSoIInFa4IEugyAJbIW9o-/view?usp=sharing

@lawnjelly

lawnjelly commented 3 years ago

Ok I've done a little investigating. When I ran with flash_batching it shows that the legacy renderer is the same as the batching (in fact in your case it is using the legacy path). I had to remind myself of the changes to ninepatches as this was a couple of months ago.

Here is some relevant discussion: https://github.com/godotengine/godot-proposals/issues/1618 https://github.com/godotengine/godot/pull/42119#issuecomment-704097907 https://github.com/godotengine/godot/pull/32170

TLDR - there was an old method that was used for ninepatches. Puthre made a PR which changed this behaviour, to the behaviour you were using in your example. This change proved controversial - there was a lot of pushback with people saying they preferred the old behaviour.

In unified batching, you now get the choice between using the old old method (default) and the old method (scaling). However due to technical reasons, you only get this choice when not using tiled mode for the ninepatch. The only way to support tiled mode is through the GLES3 shader (it is not supported in GLES2), and it was decided not to be possible / practical to support both methods in the shader. We had to choose one, and at the time @clayjohn and myself got the impression that the popular approach which was the old old approach.

However, all is not lost, in your case, if you use batching, and you change the mode to scaling (in the project setting) and change the tile mode to AXIS_STRETCH_MODE_STRETCH in both axes, then it can use batching and emulate the scaling method. If you don't have to use the tile stretch mode, not only will you get the choice of how you look, but it should render much faster.

Where this won't work is if you have absolutely have to use AXIS_STRETCH_MODE_TILE_FIT, as this can only be done via shader. I don't think you have to in your test case (try it out).

This is just the consequence of us having to make a hard choice between one method and the other in the shader. There is no right answer, either way some people will be happy and some disappointed.

BTW here's a screencap in 3.2.4 using the different stretch modes, and ninepatch mode set to scaling: ninepatch

jordi-star commented 3 years ago

Ok I've done a little investigating. When I ran with flash_batching it shows that the legacy renderer is the same as the batching (in fact in your case it is using the legacy path). I had to remind myself of the changes to ninepatches as this was a couple of months ago.

Here is some relevant discussion: https://github.com/godotengine/godot-proposals/issues/1618 https://github.com/godotengine/godot/pull/42119#issuecomment-704097907 https://github.com/godotengine/godot/pull/32170

TLDR - there was an old method that was used for ninepatches. Puthre made a PR which changed this behaviour, to the behaviour you were using in your example. This change proved controversial - there was a lot of pushback with people saying they preferred the old behaviour.

In unified batching, you now get the choice between using the old old method (default) and the old method (scaling). However due to technical reasons, you only get this choice when not using tiled mode for the ninepatch. The only way to support tiled mode is through the GLES3 shader (it is not supported in GLES2), and it was decided not to be possible / practical to support both methods in the shader. We had to choose one, and at the time @clayjohn and myself got the impression that the popular approach which was the old old approach.

However, all is not lost, in your case, if you use batching, and you change the mode to scaling (in the project setting) and change the tile mode to AXIS_STRETCH_MODE_STRETCH in both axes, then it can use batching and emulate the scaling method. If you don't have to use the tile stretch mode, not only will you get the choice of how you look, but it should render much faster.

Where this won't work is if you have absolutely have to use AXIS_STRETCH_MODE_TILE_FIT, as this can only be done via shader. I don't think you have to in your test case (try it out).

This is just the consequence of us having to make a hard choice between one method and the other in the shader. There is no right answer, either way some people will be happy and some disappointed.

BTW here's a screencap in 3.2.4 using the different stretch modes, and ninepatch mode set to scaling: ninepatch

Oh, I didn't know that. Yeah in my case I would be totally fine with using Stretch, thanks!