
Unify GPUParticles and CPUParticles #9906

Open QbieShay opened 1 month ago

QbieShay commented 1 month ago

Describe the project you are working on

Godot Engine

Describe the problem or limitation you are having in your project

https://github.com/godotengine/godot-proposals/issues/7344, for which I proposed a solution in https://github.com/godotengine/godot-proposals/issues/7517, which now seems to be pretty much unviable.

It's also worth noting that GPUParticles always set up either a compute pipeline or transform feedback. Since many VFX use particle systems with a single particle, a lot of resources are wasted on this even if we reach peak optimization on the rendering side.

Currently, users have to choose which particle simulation to use, either GPU or CPU. Since the GPUParticles update, CPUParticles have fallen behind in terms of supported features and usability. I have been thinking a lot about updating CPUParticles too, but it's effectively two separate updates (2D and 3D), meaning a lot of work and a lot of bug surface.

As features grow, the complexity and maintenance cost grows three or even four times as fast.

Describe the feature / enhancement and how it helps to overcome the problem or limitation

Please bear with me and read thoroughly.

I will be using particle to refer to a single particle, system to refer to an object that emits and manages a collection of particles, and "VFX" to refer to an object that is a collection of systems.

So VFX > System > Particle

There are different use cases for GPUParticles and CPUParticles, and different feature capabilities too. I have been pushing back on collision reports from particle systems as a general feature, because it cannot really be done reliably on GPUParticles, and I've been wanting to freeze CPUParticles for a long time.

However,

I think it's possible to create a node that has a GPU/CPU toggle.

It would support particle features that are possible only on the GPU and features that are possible only on the CPU. Using one of those features would force the corresponding simulation mode.

This would be almost transparent to the user (a toggle, maybe with an "auto" mode too, that selects CPU or GPU based on particle count, used features, etc.).

For us, this means finding a way to switch between CPU and GPU simulation: Godot needs to be able to perform this switch internally.
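As a very rough sketch of what that toggle could look like (the node, enum, property names and the "auto" heuristic below are all hypothetical, invented only to illustrate the idea):

```gdscript
# Hypothetical sketch of the unified node's toggle. The enum, properties and
# the AUTO heuristic are invented for illustration only.
extends Node3D

enum SimulationMode { AUTO, GPU, CPU }

@export var simulation_mode: SimulationMode = SimulationMode.AUTO
@export var amount: int = 8

func _uses_cpu_only_features() -> bool:
    # Placeholder: e.g. reliable per-particle collision reports.
    return false

func _resolve_simulation_mode() -> SimulationMode:
    if simulation_mode != SimulationMode.AUTO:
        return simulation_mode
    # AUTO: small systems or CPU-only features run on the CPU,
    # everything else goes to the GPU simulation.
    if amount <= 64 or _uses_cpu_only_features():
        return SimulationMode.CPU
    return SimulationMode.GPU
```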

So considering the following assumptions:

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

Implement a Godot particles-shader-to-GDScript-bytecode compiler. Any ParticleShader would gain the ability to be queried for GDScript bytecode. This bytecode would not be exposed to the user; if they want to script particles, they will do so by writing a shader.
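To make that concrete, here is a purely illustrative sketch of roughly what the compiler could produce for a trivial process() that applies gravity. The Particle class, the function shape and the calling convention are all invented for this example; the real output would be internal bytecode, not user-visible source.

```gdscript
# Illustrative only: roughly what the compiler could emit for a particles
# shader whose process() does
#     VELOCITY += vec3(0.0, -9.8, 0.0) * DELTA;
#     TRANSFORM[3].xyz += VELOCITY * DELTA;
# The Particle class and the function signature are invented for this sketch.
class Particle:
    var transform := Transform3D()
    var velocity := Vector3.ZERO
    var active := true

static func _process_particle(p: Particle, delta: float) -> void:
    p.velocity += Vector3(0.0, -9.8, 0.0) * delta
    p.transform.origin += p.velocity * delta
```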

In https://github.com/godotengine/godot-proposals/issues/7344 I already mentioned having a GLSL-to-C++ compiler. This is not really reasonable, because it would require everyone to install a compiler in order to have CPUParticles. It would also mean that users would need to compile for every platform they target. I don't think I need to go into detail about why this is extremely unwieldy.

However,

Given the previous assumptions, I believe that mimicking the same infrastructure as the GPUParticles shader, which provides a couple of built-in functions and built-in variables, is a viable way to maintain CPUParticles long term, considering their limited use cases (see above).

Therefore

A new particle node that can either run a black-magic-RenderingServer-managed GPU simulation, or behave the same way as CPUParticles, by leveraging a MultiMeshInstance node and processing transforms and extra data.
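A minimal sketch of what the CPU path amounts to, using the existing MultiMesh API (the particle data layout here is simplified, and a mesh would still need to be assigned for anything to render):

```gdscript
# Minimal sketch of the CPU path: simulate particles in script and push the
# resulting transforms into a MultiMesh, conceptually what CPUParticles3D
# already does internally. The particle data layout is simplified here.
extends MultiMeshInstance3D

var particles: Array[Dictionary] = []

func _ready() -> void:
    multimesh = MultiMesh.new()
    multimesh.transform_format = MultiMesh.TRANSFORM_3D
    multimesh.instance_count = 64
    # A mesh would also need to be assigned to multimesh.mesh to render.
    for i in multimesh.instance_count:
        particles.append({
            "transform": Transform3D(),
            "velocity": Vector3(randf() - 0.5, randf() * 4.0, randf() - 0.5),
        })

func _process(delta: float) -> void:
    for i in particles.size():
        var p: Dictionary = particles[i]
        p.velocity += Vector3(0.0, -9.8, 0.0) * delta
        var xform: Transform3D = p.transform
        xform.origin += p.velocity * delta
        p.transform = xform
        multimesh.set_instance_transform(i, xform)
```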

There would still be some code duplication, but instead of living in the user-facing feature-level code, which changes very often, it would only be in the infrastructure side, which is a deeper layer and generally requires fewer changes. In my experience as a particles maintainer, I have touched the user-facing feature code far more than the particle infrastructure code. I believe some duplication is necessary and unavoidable due to the nature of "we have two ways of doing the same thing" that having both a GPU and a CPU simulation entails.

There are challenges with this approach:

The bonus

I'm aware that this is a compromise and there will be eyebrows raised. However, I believe that given the constraints of available tech and available contributor effort, and the ultimate goal of providing artists with powerful tooling, this is a viable compromise. It will not be the fastest and it will not be the best, but it will cover 80% of needs.

If this enhancement will not be used often, can it be worked around with a few lines of script?

no

Is there a reason why this should be core and not an add-on in the asset library?

no

KoBeWi commented 1 month ago

The main problem with this approach is that particles are core, while GDScript is a module. If someone makes a custom build without GDScript, e.g. because they use C#, they won't be able to use CPU particles.

The only thing that you really need is a set of instructions that would draw the particles. It can even be an array of method pointers/Callables or whatever; I think a shader can be translated to a list of such instructions, which would then be processed by the particles node.
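As a rough sketch of that idea (the names and the particle-state layout are invented purely for illustration): the shader would be lowered once into a list of Callables, which the node then runs in order for every particle.

```gdscript
# Hypothetical sketch of the "list of instructions" idea: the shader is
# lowered once into an Array of Callables, and the particles node runs them
# in order for every particle each frame.
extends Node

var instructions: Array[Callable] = []

func _init() -> void:
    # Stand-ins for whatever the shader translation would produce.
    instructions.append(apply_gravity)
    instructions.append(integrate_position)

func apply_gravity(state: Dictionary, delta: float) -> void:
    state.velocity += Vector3(0.0, -9.8, 0.0) * delta

func integrate_position(state: Dictionary, delta: float) -> void:
    state.position += state.velocity * delta

func process_particle(state: Dictionary, delta: float) -> void:
    for instruction in instructions:
        instruction.call(state, delta)
```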

AThousandShips commented 1 month ago

I think investigating a sort of "particle language" to run CPU particles in would be interesting, but it'd most likely have significantly lower performance and take a lot of work to implement. The compiler wouldn't be too difficult: you could utilize the existing shader compiler to an extent, or maybe use glslang (though that'd mean it would become a full dependency, which it currently isn't; it's only used if you use Vulkan).

I like the idea, but I think working on making GPU particles feasible in more contexts is a more productive approach: identify why GPU particles struggle, and only if the cause of that is insurmountable and unlikely to be resolved in the near future do I think such a massive project should be seriously considered.

I think creating some kind of tool to helpfully convert particle shaders to reasonably workable C++ for development purposes would be useful and make it easier to align the two formats, but not for production, just for engine development.

winston-yallow commented 1 month ago

Another option may be to look for existing projects that can do JIT compilation of GLSL/SPIR-V for the CPU, like this for example: https://github.com/lighttransport/softcompute

I'm not sure if a cross platform solution with a compatible license exists though.

AThousandShips commented 1 month ago

Another option may be to look for existing projects that can do JIT compilation of GLSL/SPIR-V for the CPU

Agreed. I don't feel like CPU particles should support arbitrary shader code, it's just too complex IMO. If we do want that, I'd instead suggest investigating established emulators for compute code and similar, like OpenCL-related things, rather than trying to establish our own code to simulate it.

(Spliced my comment here instead, same thought!)

QbieShay commented 1 month ago

Note, I specifically proposed GDScript bytecode because I do not think this should be a general-purpose solution. We don't need to turn this into "be able to compile any Godot shader into bytecode to execute on the CPU". This is specifically and only for the subset that processes particles, with the specific problem of "impossible maintenance cost" to solve. I don't think this should be turned into a general-purpose tool because it would very much explode the scope.

AThousandShips commented 1 month ago

I'm not sure that restricting the scope, rather than keeping compilation general, would be any simpler or reduce the workload, especially if using a third-party library. Handling a specific subset can be harder and more error-prone, and especially cause unforeseen issues with edge cases.

QbieShay commented 1 month ago

Third-party libraries require installing a compiler from what I can see, which is specifically something I want to avoid, since shaders for particles are assembled at editor time. We shouldn't need to ship a compiler.

winston-yallow commented 1 month ago

I was only linking that one as an example of what is possible; if we use a third party, then it should not have additional runtime dependencies.

winston-yallow commented 1 month ago

But yeah, in general I agree that an internal solution would be preferable if its scope is feasible and maintainable.

QbieShay commented 1 month ago

Usually third parties come with extra dependencies and a certain amount of extra binary size. I'm not against it per se, but considering that mobile is the primary CPUParticles target, increasing executable size by any margin is not ideal. I'm aware that I'm also generally against third-party code, so there's a bias there, but I really feel like it's an overengineered solution for the problem at hand.

AThousandShips commented 1 month ago

I'd argue that emulating shaders is itself approaching overengineering, and especially if we spin our own solution, it's reinventing the wheel in a lot of ways.

QbieShay commented 1 month ago

I've had conflicting information on how hard this would be. I am not a compilers/language person, so I have absolutely no clue and can only trust what others say. I do believe that there's merit in rolling tailored, problem-specific solutions over trying to adapt a general-purpose one.

As for your initial concern about improving GPUParticles, it is a fact that they always run a compute pass, plus a second compute shader afterwards. When I proposed in the past to @clayjohn that we run the same transform/billboard compute pass on every MeshInstance, he said that it would make every mesh run slower. I'm inclined to believe that avoiding the cost of extra GPU compute pipelines is desirable when a CPU solution would run much faster.

QbieShay commented 1 month ago

I believe @Geometror had a benchmark for baseline cost too

AThousandShips commented 1 month ago

So some points for what I mean: