Blend modes not working on all GPUs and performance issues

Pixelorama version: 1.0

OS/device including version: Multiple devices, will explain below

Issue description: What the title says.

This is mostly intended as a reminder for me to fix these issues before 1.0 is ready, but any contributor is welcome to jump in if they want to.

Issue 1

Layer blend modes, introduced in #911, are not working on all GPUs. On affected devices the canvas appears blank (apart from the transparency checker background) even if you draw on it. This is because of these uniforms in the BlendLayers shader:

uniform float[1024] opacities;
uniform int[1024] blend_modes;
uniform vec2[1024] origins;

Since shaders cannot have uniform arrays with variable length, we have to specify a constant length. On my desktop GPU, an NVIDIA GTX 1060 6GB, a length of 1024 works as intended. However, on my Android device, a Huawei P Smart (GPU: Mali-T830 MP2), it does not. I didn't do extensive testing with different lengths, but lowering the length to 256 on all three uniforms works.

Potential solution 1

Instead of passing three uniform arrays, construct an Nx3 texture, where N is the current number of layers, use each pixel to store one piece of per-layer information, and pass that texture as a uniform to the shader. This is the solution I am leaning towards. It should also remove the hard layer limit. My main worry is performance: we would have to construct or update the texture every time the user draws, which could be slow.
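A minimal GDScript sketch of the Nx3 texture described above. The function name, the row layout (row 0 for opacity, row 1 for blend mode, row 2 for origin) and the choice of `Image.FORMAT_RGF` are assumptions for illustration, not existing Pixelorama code. Float texture support on the affected GPUs would still need to be verified.

```gdscript
# Hypothetical helper: packs per-layer blend data into an N x 3 float image.
# Assumed layout: row 0 = opacity, row 1 = blend mode, row 2 = origin (x, y).
# Expects at least one layer, since an image cannot have a width of 0.
func make_layer_metadata_image(
		opacities: Array[float], blend_modes: Array[int], origins: Array[Vector2]
) -> Image:
	var layer_count := opacities.size()
	# FORMAT_RGF stores two 32-bit floats per pixel, enough for a vec2 origin.
	var image := Image.create(layer_count, 3, false, Image.FORMAT_RGF)
	for i in layer_count:
		image.set_pixel(i, 0, Color(opacities[i], 0.0, 0.0))
		image.set_pixel(i, 1, Color(float(blend_modes[i]), 0.0, 0.0))
		image.set_pixel(i, 2, Color(origins[i].x, origins[i].y, 0.0))
	return image
```

The image could then be wrapped with `ImageTexture.create_from_image()` and passed to the material with `set_shader_parameter()`. In the shader, the three arrays would become a single `sampler2D` uniform read with `texelFetch()`, for example `texelFetch(metadata, ivec2(i, 0), 0).r` for the opacity of layer `i` (the uniform name `metadata` is hypothetical). As long as the layer count stays the same, calling `ImageTexture.update()` with a new image of the same size and format avoids allocating a new texture each time.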

Potential solution 2

Somehow check each device's shader uniform limits and dynamically change the shader at runtime. I'm not sure if that's even possible to do correctly, so maybe we could instead set the limit quite low (something like 32) and let users set the number themselves from the preferences. This isn't ideal, as we can't expect every user to troubleshoot which values work and which don't, and it feels like a very hacky solution overall.

Issue 2

The layer image data are currently passed to the shader by looping through all of the layers, collecting their image data into an array, and using that to construct a Texture2DArray, every single time the user draws. On my computer I do not notice a performance issue on small canvases, but it is quite apparent on larger ones. Even with Godot 4's GDScript performance increases, drawing in 1.x feels slower than in 0.11.x on large canvases.

Solution

Construct the Texture2DArray once, and re-construct it only when the number of layers changes or when the project changes. When drawing, check which layers are being edited, and call update_layer() only on those layers. Even if there's just a single layer, this operation is cheaper than re-constructing the whole thing over and over.
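The caching approach above could look roughly like this in GDScript. The variable and function names are hypothetical, and the sketch assumes the caller already knows which layer indices were edited. A change in canvas size or image format would also need a full rebuild, which is not handled here.

```gdscript
# Hypothetical cache around the layer texture array; names are illustrative.
var layer_textures := Texture2DArray.new()
var cached_layer_count := 0


# Full rebuild: only needed when layers are added or removed,
# or when the project changes.
func rebuild_layer_textures(layer_images: Array[Image]) -> void:
	layer_textures.create_from_images(layer_images)
	cached_layer_count = layer_images.size()


# Called while drawing: uploads only the layers that were actually edited.
func update_edited_layers(layer_images: Array[Image], edited_indices: Array[int]) -> void:
	if layer_images.size() != cached_layer_count:
		# The layer count changed, so the cached array is stale.
		rebuild_layer_textures(layer_images)
		return
	for index in edited_indices:
		layer_textures.update_layer(layer_images[index], index)
```

Because the same `Texture2DArray` instance is reused, the shader uniform pointing at it would only need to be set once rather than on every stroke.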

Extra "solution" if performance issues persist

Worst-case scenario, we can use the old layer drawing method when all layer blending modes are set to "Normal", and only use the shader-based version when at least one layer has a different blending mode.

Potential solution for all of the above issues

Maybe using Drawable Textures instead could help, when they get implemented.
