Dawoodoz / DFPSR

Fast realtime software rendering library for C++14 using SSE/AVX/NEON. 2D, 3D and isometric rendering with minimal system dependencies.
https://dawoodoz.com/dfpsr.html

Generate optimized code for shaders using a new language #80

Open Dawoodoz opened 1 year ago

Dawoodoz commented 1 year ago

To both allow previewing models with third-party shaders in the model editor and get high performance in the final release of an application, a portable shader language is needed for vertex and texture shaders. The generated code should be readable enough to modify by hand in C++, and the option of generating more optimized intrinsic functions for C/C++ would also make the language reusable outside of the framework.

Texel shaders would process regions of an image, similar to the Halide language, but with built-in functions for sampling light. This would be like a subset of the media machine, but optimized for performance: virtualization would allow floating-point operations in compound instructions, and generated code would run at full speed. Shading to texture is the technique GPUs use for subsurface scattering, making skin look more realistic by blurring the shadows after calculating colored glow from the light's direction.
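As a rough illustration of a texel shader working on a region rather than a single pixel, here is a minimal sketch (the function name and layout are hypothetical, not part of the library): a 3-tap box blur over one row of intensities, the kind of pass used to soften shadow edges for subsurface scattering.

```cpp
#include <cstddef>
#include <vector>

// A texel-shader-like pass over a region: each output sample reads a
// neighborhood of inputs. Edge samples are clamped to the row bounds.
inline std::vector<float> blurRow(const std::vector<float> &row) {
    std::vector<float> out(row.size());
    for (size_t x = 0; x < row.size(); x++) {
        float left  = row[x > 0 ? x - 1 : x];
        float right = row[x + 1 < row.size() ? x + 1 : x];
        out[x] = (left + row[x] + right) / 3.0f; // 3-tap box filter.
    }
    return out;
}
```

Because the pass reads a whole region, a compiler for such a language can tile and vectorize it, which is the point of generating code instead of interpreting per pixel.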

Pixel shaders could be referred to using names in the materials and then fetch a pre-compiled pixel shader from the graphics engine. A collection of generic built-in pixel shaders can also be added to the core renderer, to extend what can be done with interpreted vertex shaders.

Vertex shaders should work pretty much like on a GPU, but with certain limitations in control flow to make it faster on a CPU.

Interpreted mode for quick prototyping in the 3D editor:

Compiled mode:

Calling the shader generation should be easy with both external build systems and the library's own build system.

The language syntax should abstract away both vector length and how the data is stored, so that it does not matter if one uses a planar or packed vertex structure.

Conditional if statements should not be allowed, because they would not be data parallel for vectorization. Masking operations should be used instead.
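A minimal sketch of the masking idea in scalar C++ (hypothetical helper names): instead of branching per element, build an all-ones/all-zeros mask and blend. The same pattern maps directly to SIMD blends such as SSE `_mm_blendv_ps` or NEON `vbslq_f32`, which keeps every lane on the same control path.

```cpp
#include <cstdint>

// All bits set when the condition holds, all bits clear otherwise.
inline int32_t selectMask(bool condition) {
    return condition ? -1 : 0;
}

// Picks a where the mask is set and b elsewhere, without branching.
inline int32_t maskedSelect(int32_t mask, int32_t a, int32_t b) {
    return (mask & a) | (~mask & b);
}

// Branchless clamp: equivalent to "if (value > limit) value = limit;".
inline int32_t clampToLimit(int32_t value, int32_t limit) {
    int32_t mask = selectMask(value > limit);
    return maskedSelect(mask, limit, value);
}
```

Both sides of the blend are always evaluated, which is why the shader language can forbid `if` statements without losing expressiveness.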

Models

For best integration with the Model API, the old format will become the default vertex structure. A new type of pointer, similar to SafePointer, will contain a padded power-of-two element stride, so that it can access packed, planar and semi-planar data by changing the element stride in the pointer. The correct element is found automatically by the [] operator, which pre-multiplies the element size with the stride and uses the base-two logarithm for bit shifting.

If requesting a part's vertex color from a model, you get a pointer to the first FVector4D with a stride to the next element, which will be the same for all vertices in the same packing. One packing is for the final render, so that you don't need to read the other attributes used for generating light. Rarely used data is packed together further back in the allocation. Separate vertex buffers could potentially be used for different types of animation.

In the 3D model editor, one can use the same type of pointer for accessing planar buffers from separate allocations, which are currently used to save memory by only cloning the attributes that changed for planar immutable undo history.
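The strided pointer described above could look roughly like this; the names are hypothetical, not the library's API. The key trick is that a power-of-two stride lets `operator[]` replace a multiplication with a left shift by the stride's base-two logarithm, so the same pointer type walks packed, planar or semi-planar data by just changing one field.

```cpp
#include <cstddef>
#include <cstdint>

// index * 2^logStride, computed as a shift.
inline size_t stridedOffset(size_t index, uint32_t logStride) {
    return index << logStride;
}

template <typename T>
struct StridedPointer {
    T *base;            // First element of the attribute.
    uint32_t logStride; // Base-two logarithm of the element stride.
    T& operator[](size_t index) const {
        return base[stridedOffset(index, logStride)];
    }
};

// Example: read the first color component of vertex v from a packed
// buffer holding 8 floats per vertex (4 position + 4 color), so the
// color attribute starts at offset 4 with a stride of 8 floats.
inline float packedColorX(float *buffer, size_t v) {
    StridedPointer<float> colorX{buffer + 4, 3}; // 2^3 = 8 float stride.
    return colorX[v];
}
```

A fully planar layout would use the same type with `logStride = 0` and a base pointing into a separate attribute allocation.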

Dawoodoz commented 9 months ago

To keep it simple, one could start by assigning vertex, pixel and texel shaders using thread-safe lambda functions. It would still benefit from having customizable vertex structures, because they can easily be loaded from the model. The 3D model editor could then be released with good performance before having a shader compiler, by allowing a difference in visual appearance; it would just need a few pre-compiled shaders in a generic graphics engine. The code for these lambda functions can later be transpiled from a shader language and bundled into the specialized graphics engines, without any limitations from scripting.
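A minimal sketch of the lambda-based approach, with made-up types standing in for the engine's real ones: the engine stores a `std::function` per material and invokes it per pixel, so shaders can be swapped at runtime long before any transpiler exists.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical stand-ins for the engine's shader input and output.
struct PixelInput { float u, v; };
struct Color { uint8_t r, g, b, a; };
using PixelShader = std::function<Color(const PixelInput&)>;

// A trivial pre-compiled shader: turn UV coordinates into a gradient.
// The lambda captures nothing, so calling it from many threads is safe.
inline PixelShader makeGradientShader() {
    return [](const PixelInput &in) -> Color {
        return Color{ uint8_t(in.u * 255.0f), uint8_t(in.v * 255.0f), 0, 255 };
    };
}
```

The indirection through `std::function` has per-call overhead, which is why the comment treats this as a stopgap until the lambda bodies can be generated and inlined into specialized graphics engines.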

Vertex shaders would need more than one function for good integration. Before transforming individual vertices, transform the bounding shape and let culling and occlusion decide whether vertices should be transformed for occlusion shapes and visible geometry.
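The two-stage idea can be sketched like this (all names hypothetical, with a deliberately fake visibility test): transform and test the bounding shape first, and only pay for per-vertex work when the model survives culling.

```cpp
#include <cstddef>

struct Sphere { float x, y, z, radius; };

// Placeholder visibility test standing in for real frustum/occlusion
// culling: visible if the sphere reaches past the plane at z = 0.
inline bool sphereVisible(const Sphere &s) {
    return s.z + s.radius > 0.0f;
}

// Stage 1 gates stage 2: returns how many vertices were transformed,
// zero when the whole model was culled.
inline size_t transformIfVisible(const Sphere &bound, size_t vertexCount) {
    if (!sphereVisible(bound)) {
        return 0; // Skip the vertex shader for the entire model.
    }
    // ... stage 2: run the vertex shader for each of the vertices ...
    return vertexCount;
}
```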

Dawoodoz commented 1 month ago

Upgrading my compiler interpreter with a backtracking recursive descent parser that is generated automatically from an arbitrary context-free grammar during initialization. It can then handle any CFG where no more than one syntax tree can generate a given sequence of tokens (non-ambiguous), because it will simply try another interpretation when one turns out to be a dead end. Obvious beginner mistakes that provably create ambiguity in the grammar will be pointed out during analysis, to track down bugs faster. With some balancing of how much heuristics to apply before attempting to parse a sub-expression, the worst-case runtime complexity should be around O(n log n) for easily readable grammars, but O(2^n) if the heuristics do not work at all. An alternative pattern consisting of a single sub-expression is treated as a sub-set, so that the parser does not randomly walk into dead ends and cause redundant backtracking. If I add fuzzy matching with machine learning, it should be able to select accurate error messages by predicting what the user tried to write. Then it just needs a lot of testing and an API before it can be used for code-generation tools.
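For readers unfamiliar with the technique, here is a hand-written toy version of backtracking recursive descent (not the generated parser described above) for the grammar `expr = digit '+' expr | digit`: each rule saves the cursor, and when an alternative fails, the cursor is restored and the next alternative is tried.

```cpp
#include <cstddef>
#include <string>

struct Cursor { const std::string *text; size_t pos; };

// Consumes one character in '0'..'9' on success.
inline bool parseDigit(Cursor &c) {
    if (c.pos < c.text->size() && (*c.text)[c.pos] >= '0' && (*c.text)[c.pos] <= '9') {
        c.pos++;
        return true;
    }
    return false;
}

// expr = digit '+' expr | digit
inline bool parseExpr(Cursor &c) {
    size_t saved = c.pos;
    if (parseDigit(c) && c.pos < c.text->size() && (*c.text)[c.pos] == '+') {
        c.pos++;
        if (parseExpr(c)) return true;
    }
    c.pos = saved; // Backtrack: the first alternative was a dead end.
    return parseDigit(c);
}

// Accepts only when the whole input is consumed.
inline bool parses(const std::string &input) {
    Cursor c{&input, 0};
    return parseExpr(c) && c.pos == input.size();
}
```

The exponential worst case mentioned above comes from exactly this restore-and-retry step: without heuristics or memoization, the same sub-expression can be re-parsed many times.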

Dawoodoz commented 1 month ago

Creating a tool for automated migration to newer library versions, using pre-written search-and-replace commands, might be a good way to test the parser before using it for code generation.

Dawoodoz commented 1 week ago

Even if integrated with the new build system, it is good to keep them as separate applications, to allow combining with other build systems and code generators. Starting a new instance of a transpiler for each file would however be quite slow, so the command-line interface might have to take all filenames at once, so that language definitions are interpreted only once for all modules.

One could let a scripted platform have basic entry points for static analysis (text to messages), code generation (custom format to text), transpilation (text to text using checksums), style formatting (changing text while checking that tokenization results are not affected), et cetera. The tokenizer and parser would then be called from the scripted language as a standard library.
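One way to model those entry points in the host, sketched with hypothetical names (none of these come from the library), is a struct of `std::function` slots that each scripted tool fills in:

```cpp
#include <functional>
#include <string>
#include <vector>

// Each slot corresponds to one entry point; unset slots mean the
// scripted tool does not provide that capability.
struct ToolEntryPoints {
    std::function<std::vector<std::string>(const std::string&)> staticAnalysis; // text to messages
    std::function<std::string(const std::string&)> transpile;                   // text to text
    std::function<std::string(const std::string&)> formatStyle;                 // text to reformatted text
};

// A placeholder tool whose transpiler is the identity transform.
inline ToolEntryPoints makeDefaultTools() {
    ToolEntryPoints tools;
    tools.transpile = [](const std::string &text) { return text; };
    return tools;
}

// Run the transpile entry point, passing the source through unchanged
// when no transpiler has been registered.
inline std::string runTranspile(const ToolEntryPoints &tools, const std::string &source) {
    return tools.transpile ? tools.transpile(source) : source;
}
```

The host's tokenizer and parser would then be exposed to the scripted side as library calls, matching the last sentence above.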