Open ykafia opened 2 years ago
Since every local and global element of code has its own identifier we have to avoid using the same identifiers for different values/types/variables.
It's fairly common for compilers to first run a user-facing pass on the AST (mostly type checking) that reports errors using the user-declared names. Afterwards, those names are replaced with generated names that are unique within the compilation unit, so that no two identifiers that are meant to be distinct end up mapped to the same name.
Of course in case of global variables that wouldn't necessarily be the case, though it's very important to note here what the behaviour would be if two mixins declare a global variable using different types (would there first be a typechecking pass on the AST for user-friendly errors, and then a secondary bytecode typechecking pass to ensure correctness after mixing?).
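To make the renaming idea concrete, here is a minimal sketch of such a pass. The input layout (a dict of mixins with `globals` and `locals`) and the `%vN` naming scheme are assumptions made up for this illustration, not anything from Stride. Locals get fresh unit-wide names, globals keep their declared names, and a global declared with two different types is reported as a conflict, which is only one possible answer to the question above.

```python
from itertools import count

def uniquify_locals(mixins):
    """Rename locals to unit-wide unique names; keep globals as declared.

    `mixins` maps a mixin name to {"globals": {name: type}, "locals": [name, ...]}.
    This layout is hypothetical and only serves the illustration.
    """
    fresh = count()
    renamed = {}
    globals_seen = {}   # global name -> (type, declaring mixin)
    conflicts = []      # (name, first mixin, first type, second mixin, second type)
    for mixin, decls in mixins.items():
        for name, ty in decls["globals"].items():
            first_ty, first_mixin = globals_seen.setdefault(name, (ty, mixin))
            if first_ty != ty:
                conflicts.append((name, first_mixin, first_ty, mixin, ty))
        # Every local gets a fresh name, even if two mixins both call it "tmp".
        renamed[mixin] = {name: f"%v{next(fresh)}" for name in decls["locals"]}
    return renamed, conflicts

mixins = {
    "A": {"globals": {"Color": "float4"}, "locals": ["tmp", "i"]},
    "B": {"globals": {"Color": "float3"}, "locals": ["tmp"]},
}
renamed, conflicts = uniquify_locals(mixins)
print(renamed)
print(conflicts)
```

Because the conflict is detected while the user-declared names are still available, the error message can mention `Color` and both mixins rather than a generated identifier.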
Here's an update on the current status of my prototype, since my days off end this week:
Most of the parsing is finished. I think there will be issues with some shaders, since I haven't implemented all the rules for how staging, streaming and compositions work.
The preprocessor is also mostly functioning, but I haven't yet implemented method definitions in define macros (I've got some examples in VL.Stride to help me, so I will implement that later). In the meantime I reused Stride's current macro preprocessor, which is based on Clang's implementation. Interestingly, my preprocessor implementation performs about as well as Clang's and is simpler to read and enhance.
So far I've implemented a system for creating an AST from the parse tree :
Once I have a decent AST, I will try to implement the mixing system without checks, to at least make sure the AST can match SPIRV better. I worry there are SDSL features inherited from HLSL that just wouldn't work the same way, since SPIRV is largely inspired by GLSL.
Of course in case of global variables that wouldn't necessarily be the case, though it's very important to note here what the behaviour would be if two mixins declare a global variable using different types (would there first be a typechecking pass on the AST for user-friendly errors, and then a secondary bytecode typechecking pass to ensure correctness after mixing?).
That's a good question. I tried to generate some SPIRV bytecode by hand, and most of the compilers and tools I used don't handle this issue. I'll experiment with that first once I have something to work with.
(PS : i'm using mermaid for the graphs)
I've been working on the compiler part of things, so here is my current design.
Each mixin is parsed separately, and we create one AST per mixin (just like the current compiler). But instead of manipulating the ASTs, we build a graph of ASTs and query it to generate bytecode.
Since each mixin would contain its own (unchanged) AST and some TAC (three-address code), updating a mixin would only require regenerating the TAC for that mixin.
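For readers who haven't met three-address code before, here is a small sketch of how an expression tree can be lowered to it. The tuple-based AST and the `t0`, `t1`, ... temporaries are assumptions for the example, not the prototype's actual data structures.

```python
from itertools import count

def to_tac(expr, code, temps):
    """Lower a nested expression to three-address code.

    `expr` is either a variable name (str) or a tuple (op, left, right).
    Appends (dest, op, a, b) entries to `code` and returns the name
    that holds the value of `expr`.
    """
    if isinstance(expr, str):
        return expr  # variables are already addressable, nothing to emit
    op, left, right = expr
    a = to_tac(left, code, temps)
    b = to_tac(right, code, temps)
    dest = f"t{next(temps)}"
    code.append((dest, op, a, b))
    return dest

# a * b + c * d
ast = ("+", ("*", "a", "b"), ("*", "c", "d"))
tac = []
result = to_tac(ast, tac, count())
for dest, op, a, b in tac:
    print(f"{dest} = {a} {op} {b}")
```

Each instruction has at most one operator and one destination, which is close to how SPIRV instructions produce a single result identifier.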
So here's a simplified class diagram for it :
classDiagram
class ShaderMixin
ShaderMixin: +String code
ShaderMixin: +ShaderProgram AST
ShaderMixin: +List~ShaderMixin~ mixins
ShaderMixin: +ShaderByteCode spirvByteCode
class ShaderProgram
ShaderProgram: +List~ConstBufferValues~ cBuffer
ShaderProgram: +List~ShaderVariables~ variables
ShaderProgram: +List~ShaderMethod~ methods
class ShaderMethod
ShaderMethod: +string Name
ShaderMethod: +bool IsStatic
ShaderMethod: +bool IsStream
ShaderMethod: +string returnType
ShaderMethod: +List~Variables~ params
ShaderMethod: +List~Statements~ statements
ShaderMethod: +List~TAC~ threeAddressCode
ShaderMethod: +...
class ShaderByteCode
ShaderByteCode: +byte[] spirv
ShaderByteCode: +string glsl_code
ShaderByteCode: +string hlsl_code
ShaderByteCode: +string msl_code
And the process for caching and updating a shader.
flowchart TB
Start((start)) --> LoadStrideShaders
LoadStrideShaders --> LoadUserShader
LoadUserShader --> CacheShaders
CacheShaders --> ShaderDBEvent(Wait for shader event)
ShaderDBEvent --> IsShaderAdd{Is shader add ?}
IsShaderAdd -->|yes| LoadUserShader
IsShaderAdd -->|no| IsMixinUpdate{Is mixin update ?}
IsMixinUpdate --> QueryAndUpdate[Query and update shader]
QueryAndUpdate --> CacheShaders
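The caching part of this flow could look roughly like the sketch below. `ShaderCache`, `fake_compile` and the choice of a SHA-256 content hash are all assumptions for illustration. The point is only that a mixin is recompiled when its source text actually changes.

```python
import hashlib

class ShaderCache:
    """Recompile a shader only when its source text has changed (hypothetical helper)."""

    def __init__(self, compile_fn):
        self._compile = compile_fn
        self._entries = {}  # shader name -> (source hash, compiled result)

    def get(self, name, source):
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        cached = self._entries.get(name)
        if cached is not None and cached[0] == digest:
            return cached[1]  # unchanged source: reuse the cached result
        result = self._compile(source)
        self._entries[name] = (digest, result)
        return result

compiled = []  # records every real compilation, to show when the cache is hit

def fake_compile(source):
    # Stand-in for the real compiler, assumed for this example.
    compiled.append(source)
    return f"bytecode({len(source)})"

cache = ShaderCache(fake_compile)
cache.get("MyMixin", "shader MyMixin {}")
cache.get("MyMixin", "shader MyMixin {}")            # cache hit
cache.get("MyMixin", "shader MyMixin { float x; }")  # source changed, recompiled
print(len(compiled))
```

This sketch does not track dependencies between mixins, so a real version would also need to invalidate whatever was mixed from the updated shader.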
And an overview of the process I chose for parsing and compilation.
flowchart TB
subgraph Parsing
StartParsing((start)) --> Load
Load[load shader] --> FirstParse[parse shader]
FirstParse --> HasMixins{has\n mixins ?}
HasMixins -->|yes|LoadOtherShaders[Load and parse other\n mixins recursively]
LoadOtherShaders --> ReorderMixins
ReorderMixins --> EndParsing((End))
HasMixins -->|no|EndParsing((End))
end
subgraph SemanticAnalysis
StartSemanticAnalysis((start)) --> VariableScope
VariableScope --> TypeChecking
TypeChecking --> MethodCall
MethodCall --> EndSemanticAnalysis((end))
end
subgraph TACGen
StartTACGen((start)) --> Generation
Generation --> ExpressionOptimization
ExpressionOptimization --> ConditionalFlowOptimization
ConditionalFlowOptimization --> EndTACGen((end))
end
subgraph SpirvGen
StartSpirvGen((start)) --> ConversionTAC2SpirvRepresentation
ConversionTAC2SpirvRepresentation --> SpirvRepresentation2bytecode
SpirvRepresentation2bytecode --> EndSpirvGen((end))
end
Parsing --> SemanticAnalysis
SemanticAnalysis --> TACGen
TACGen --> SpirvGen
Shader Mixer Rewrite
Following up on the shader parser discussion and my implementation, and on @xen2's proposal, here's an RFC where I will detail an implementation of a SPIRV compiler that we can discuss here.
SPIRV, GLSL but Assembly
Let's start with a short introduction to how SPIRV works. I'd advise taking a look at this example for reference.
SPIRV modules are made with :
label and finish with an end token.
Each element of SPIRV code is defined by an identifier (uint32) and an instruction made up of a set of words (uint32).
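To illustrate the "set of words" part, here is a small sketch that builds and then walks a tiny SPIRV word stream. In the binary form, the first word of each instruction packs the word count in the high 16 bits and the opcode in the low 16 bits, and a module starts with a five-word header. The helper names `instruction` and `decode` are made up for the example, and the module is deliberately minimal rather than a complete shader.

```python
import struct

SPIRV_MAGIC = 0x07230203
OP_MEMORY_MODEL = 14   # operands: addressing model, memory model
OP_CAPABILITY = 17     # operand: capability

def instruction(opcode, *operands):
    """Encode one instruction as a list of uint32 words."""
    word_count = 1 + len(operands)
    return [(word_count << 16) | opcode, *operands]

def decode(words):
    """Walk the instruction stream that follows the 5-word header."""
    out = []
    i = 5
    while i < len(words):
        word_count = words[i] >> 16
        opcode = words[i] & 0xFFFF
        if word_count == 0:
            raise ValueError("invalid instruction with zero word count")
        out.append((opcode, words[i + 1:i + word_count]))
        i += word_count
    return out

# Header: magic, version 1.0, generator, id bound, reserved schema word.
header = [SPIRV_MAGIC, 0x00010000, 0, 1, 0]
module = (
    header
    + instruction(OP_CAPABILITY, 1)       # 1 = Shader
    + instruction(OP_MEMORY_MODEL, 0, 1)  # 0 = Logical, 1 = GLSL450
)
binary = struct.pack(f"<{len(module)}I", *module)  # little-endian uint32 words

print(decode(module))
print(len(binary), "bytes")
```

Since every instruction carries its own length, a tool can skip over instructions it doesn't understand, which is convenient for any pass that only needs to rewrite a few of them.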
New Design
Current system
The current shader system does a lot of work to parse and combine shaders. Each SDSL mixin is parsed into an AST, then costly operations on those ASTs mix them together and translate the result into an HLSL AST. That AST is then either transformed into text and compiled with DXC, or converted into GLSL to be compiled into either SPIRV or an OpenGL program.
New system
The main idea of the new system would be to compose shaders with spirv bytecode instead of mixing ASTs.
Draft idea
Each shader code/mixin will be parsed into its own AST, and both will be stored in a ShaderCode class. Once parsed, the AST will be analyzed by a syntax/semantic analyser to allow for some optimizations like constant propagation. After the optimization, the ShaderCode class will generate partial SPIRV bytecode.
The composition will be done in a ShaderMixer class containing an array of ShaderCode objects. Once every mixin is added and the mixing is started, the ShaderMixer will gather each ShaderCode byte array and merge them together based on SDSL's language rules (i.e. a staged variable won't be changed if already defined in a previous mixin, methods that are mixed through inheritance will just be concatenated in order of appearance, etc.).
This way, each ShaderCode will generate its own bytecode that can be reused in other mixins without recompiling unless the actual SDSL code has changed. I also assume concatenating byte arrays can be made very efficient.
Limitations
While this is possible, there is one core issue, which comes from elements 4 and 5 of the SPIRV composition I listed above.
Since every local and global element of code has its own identifier we have to avoid using the same identifiers for different values/types/variables.
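As a rough sketch of what avoiding clashes could mean when concatenating modules, the example below shifts every identifier of a module by the number of identifiers used by the modules before it. The module layout (a `bound` plus instructions whose operands are all identifiers) and the instruction names are simplified assumptions, not real SPIRV encoding.

```python
def merge_modules(modules):
    """Concatenate modules, shifting ids so that no two modules share one.

    Each module is {"bound": n, "instructions": [(name, [ids...])]}, where ids
    range from 1 to n - 1. This layout is a simplification for illustration:
    here every operand is an id, which is not true of real SPIRV.
    """
    merged = []
    offset = 0
    for module in modules:
        for name, ids in module["instructions"]:
            merged.append((name, [i + offset for i in ids]))
        offset += module["bound"] - 1  # number of ids this module used
    return {"bound": offset + 1, "instructions": merged}

# First id of each instruction is its result id, the rest are references.
a = {"bound": 3, "instructions": [("type_float", [1]), ("variable", [2, 1])]}
b = {"bound": 3, "instructions": [("type_int", [1]), ("variable", [2, 1])]}
merged = merge_modules([a, b])
for name, ids in merged["instructions"]:
    print(name, ids)
print("bound:", merged["bound"])
```

On actual bytecode, the remapping would need per-opcode knowledge of which words are identifiers and which are literals, so it is more involved than this offset loop suggests.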
We can either partition those identifiers or handle them dynamically for each ShaderMixer object.
Considerations
Constant buffers, shader reflection, textures? I don't yet know how Stride binds data to shader code.
What else ?