libsdl-org / SDL_shader_tools

Shader compiler and tools for SDLSL (Simple Directmedia Layer Shader Language)

https://libsdl.org/

zlib License

150 stars 3 forks source link

Vertex Shader Syntax Megathread #3

Open icculus opened 2 years ago

icculus commented 2 years ago

I NEED FEEDBACK. IF YOU LOVE OR HATE THIS, PLEASE SPEAK LOUDLY.

IF THIS WILL ABSOLUTELY CAUSE PAIN AND MISERY, SPEAK TWICE AS LOUDLY.

THANK YOU.

Here's a first example of where I'm heading towards. The syntax and the heavy commenting explaining the syntax are what's important; the code itself is not and just meant to be an example.

/*
 * FIRST DRAFT OF SDL SHADER LANGUAGE (or whatever we're calling it) SYNTAX.
 *
 * This was a random HLSL shader generated by Unreal Engine 3, that I trimmed up
 * to remove unused code, etc.
 *
 * Don't worry about what this shader _does_, as it's just one of thousands of
 * variations UE3 generated from whatever and I grabbed one at random.
 *
 *
 * I don't know if this program is correct yet; not only is the grammar still
 * evolving and the compiler still in development, it's possible I deleted
 * a needed function by accident or fat-fingered a simple edit. Just get the
 * idea of what I'm trying to do here.
 *
 * Anything in here might change. Also, if something looks like a
 * short-sighted, dumbass mistake, it probably is. Come talk to me about it.
 *
 * --ryan.
 *
 * icculus@icculus.org
 * https://twitter.com/icculus
 */

// These are constants (uniforms). We put them in a struct, because we'll be
// setting them up in a buffer on the CPU and sending them to GPU memory all
// at once, where the shader will _also_ be treating them as fields in a block
// of memory instead of individual variables. In that sense, there's no value
// in considering them individually.
//
// Since this piece happens to be compatible with C/C++ syntax, and you
// can #include files from there _and_ in these shaders, it's not unreasonable
// that this would go in a header by itself that the shader and native code
// can share (or the shader can have some #ifdefs so you can include it and
// only see this part of the file from C)...this can help make sure they're
// definitely looking at the same data layout.
//
// Alignment and packing on these will be well-defined.
struct FVertexConstants
{
    float4x4 ViewProjectionMatrix;
    float4x4 LocalToWorld;
    float3x3 WorldToLocal;
    float4 CameraPosition;
    float4 ScreenPositionScaleBias;
};

// This is _also_ a constant, but it's one that changes at a different
// rate than the others (one lives per-mesh, one lives per-frame), so
// it will live in a seperate buffer.
struct BoneMatricesConstants
{
    float4x3 BoneMatrices[75];
};

// These are your vertex attributes. You set up your arrays however you like
//  (interleaved in a single buffer, or totally separate buffers, or a little
//  of both), and fill in the details (format, offset, stride, buffer index)
//  when creating a pipeline object. Then assign the buffers with the data to
//  the appropriate indices before drawing, and when the vertex shader
//  runs, it assembles this struct from those sources. You specify
//  this struct as an argument to your main function.
//
// "@attribute(0)" means "this value is retrieved with the details in the
// first entry of SDL_GpuPipelineDescription::vertices array." There are
// several things with "@keyword" in the current language. I call them
// ATtributes. I'm _funny_!
struct FVertexInput
{
    float4 Position @attribute(0);
    half3 TangentX @attribute(1);
    half4 TangentZ @attribute(2);
    int4 BlendIndices @attribute(3);
    float4 BlendWeights @attribute(4);
    float3 DeltaPosition @attribute(5);
    half3 DeltaTangentZ @attribute(6);
};

// This struct holds all the _outputs_ from this shader.
// This struct is passed to the next stage of the pipeline (the fragment
// shader, at this moment). That next shader will need to know what these
// attributes mean and match up, so putting this in a common header
// (or having the matching fragment shader function _in the same source
// file_) is highly recommended.
//
// Other shader languages let you specify what these outputs mean, so
// the fragment shader can pick out the parts it cares about, but I'm
// hoping to avoid that and just pass it all through. If this turns out
// to be a terrible idea, we have @attributes at the ready!
//
// The "@position" means "this is the vertex position set by this shader"
// The system needs to know this beyond your shader pipeline, so there's
// an attribute to let it know where to find it in the output. Everything
// else here is just data that the fragment shader acts on.
struct FVertexOutput
{
    float4 ScreenPosition;
    float4 CameraVector_FogA;
    float4 Position @position;
};

// Heretical to C, functions start with the word "function" ...I am
// gambling that when people say "C-like language" they mean some
// set of characteristics that may include: curly braces, a preprocessor,
// almost no rules about whitespace, the distinctive for-loop syntax,
// and calculate-then-assign operators like +=, *=, etc. But not that it
// exactly resembles C in all ways.
//
// Notably, PHP and Javascript both use the "function" keyword and
// everyone thinks they are "C enough," so I hope this is okay.
//
// The C grammar just expects you to drop right into a datatype and figure
// out what you're talking about several tokens down the line...was it a
// function declaration? A function definition? A variable? A struct that
// might be defining a variable or a new type...?! An identifier, which
// might be a user-declared typedef, making parsing REALLY HARD...?!
//
// Sprinkling some "function" and "var" prefixes on these things helps me,
// the parser writer, immensely, and if we're being honest I bet it helps
// you, the program writer, too.
function float3 MorphPosition(FVertexInput Input)
{
    return Input.Position.xyz + Input.DeltaPosition;
}

// This function uses intrinsic functions, like "normalize" and "cross". Right now
// these match naming conventions in Direct3D, but that may change (GLSL provides
// many of the same functions but disagrees on what to call some of them).
//
// You can also see that matrices can be dereferenced like arrays.
//
// local variables are defined with a "var" keyword, which is also heretical
// to C users, but I couldn't help but notice WebGPU did this too, probably
// because parsing C is hard.
function float3x3 MorphTangents(FVertexInput Input)
{
    var float3x3 Result = 0;
    var float3 TangentX = ((Input.TangentX / 127.5) - 1);
    var float4 TangentZ = ((Input.TangentZ / 127.5) - 1);
    var float3 DeltaTangentZ = ((Input.DeltaTangentZ / 127.5) - 1);
    Result[2] = normalize(float3(TangentZ.x, TangentZ.y, TangentZ.z) + DeltaTangentZ);
    Result[0] = normalize(TangentX - (dot(TangentX, Result[2]) * Result[2]));
    Result[1] = normalize(cross(Result[2], Result[0]) * TangentZ.w);
    return Result;
}

// Note that this calls CalcBoneMatrix which has not been predeclared (it appears
// later in the file). There is no predeclaration; as long as the function is
// in the same compilation unit, the compiler will accept it (parsing accepts all
// function calls, then during semantic analysis, it parses functions definitions
// first to build a full list of predeclarations).
function float3 SkinPosition(FVertexInput Input, BoneMatricesConstants BoneBuffer)
{
    var float4 Position = float4(MorphPosition(Input), 1);
    var float4x3 BoneMatrix = CalcBoneMatrix(Input, BoneBuffer);
    return Position * BoneMatrix;
}

// Notable: all structs are passed by reference (which means if you change
// a field, it will be changed for the caller too). There are no pointers,
// references, or copies made. We should probably offer a way to mark it "const" though...
function float4x3 CalcBoneMatrix(FVertexInput Input, BoneMatricesConstants BoneBuffer)
{
    var float4x3 BoneMatrix = Input.BlendWeights.x * BoneBuffer.BoneMatrices[Input.BlendIndices.x];
    BoneMatrix += Input.BlendWeights.y * BoneBuffer.BoneMatrices[Input.BlendIndices.y];
    BoneMatrix += Input.BlendWeights.z * BoneBuffer.BoneMatrices[Input.BlendIndices.z];
    BoneMatrix += Input.BlendWeights.w * BoneBuffer.BoneMatrices[Input.BlendIndices.w];
    return BoneMatrix;
}

// Vector constructors work like you'd expect. Also, if you haven't noticed, these are
// floatX instead of vecX.
// Matrix and vector multiplication is built-in using standard C operators (even if this
// dithers down to an intrinsic "mul" function in Direct3D, etc).
function float3x3 SkinTangents(FVertexInput Input, BoneMatricesConstants BoneBuffer)
{
    var float3x3 Tangents = MorphTangents(Input);
    var float3x3 TangentBasis;
    var float4x3 BoneMatrix = CalcBoneMatrix(Input, BoneBuffer);
    TangentBasis[0] = float4(Tangents[0], 0) * BoneMatrix;
    TangentBasis[1] = float4(Tangents[1], 0) * BoneMatrix;
    TangentBasis[2] = float4(Tangents[2], 0) * BoneMatrix;
    return TangentBasis;
}

// The main event, literally. This is called once for each vertex. A vertex shader!
//
// You'll note the "@vertex" attribute. General functions can be shared between different shader types, but
// you have to flag the entry point as @vertex so certain magic can occur. Likewise for @fragment entry points.
// One parameter is marked "@inputs," which tells the compiler that this is the magic struct that will be
// assembled from various buffers based on the pipeline details. It changes values for every vertex. "@buffer(1)" means
// "whatever was bound to buffer index 1 with SDL_GpuSetRenderPassVertexBuffer() is available here." It stays
// the same for every vertex until you change it.
// The outputs are returned from the function in a struct.
// Note that you can have more than one shader in a single source file, and can also have vertex _and_
// fragment shaders in the same file; when loading shaders for use, you tell it the name of the entry point
// you want.
// Note that local variable declarations can be interspersed with code.
// Note that assignment is NOT an expression; the `Output.ScreenPosition = Output.Position = X;` works because
// assignment statements allow this as syntactic sugar, but something like `if (X = Y)` or even `if ((X = Y) != 0)` would be illegal.
function @vertex FVertexOutput VertexMain(FVertexInput Input @inputs, FVertexConstants Constants @buffer(0), BoneMatricesConstants BoneBuffer @buffer(1))
{
    var FVertexOutput Output;
    var float4 WorldPosition = Constants.LocalToWorld * float4(SkinPosition(Input, BoneBuffer), 1);
    Output.ScreenPosition = Output.Position = Constants.ViewProjectionMatrix * WorldPosition;
    var float3x3 TangentBasis = SkinTangents(Input, BoneBuffer);
    var float3 WorldVector = Constants.CameraPosition.xyz - WorldPosition.xyz * Constants.CameraPosition.w;
    Output.CameraVector_FogA.xyz = TangentBasis * (Constants.WorldToLocal * WorldVector);
    Output.CameraVector_FogA.w = 0;
    return Output;
}

// end of example_sdlsl_vertex_shader_ue3.sdlsl ...

darksylinc commented 2 years ago

Vertex Input

struct FVertexInput
{
    float4 Position @attribute(0);
    half3 TangentX @attribute(1);
    half4 TangentZ @attribute(2);
    int4 BlendIndices @attribute(3);
    float4 BlendWeights @attribute(4);
    float3 DeltaPosition @attribute(5);
    half3 DeltaTangentZ @attribute(6);
};

I don't have critique. Looks fine. I have only a couple comments:

APIs carry a legacy behavior where all types can be transformed to float. e.g. if I write float4 BlendWeights and I specify RGBA8_UNORM, RGBA8_SNORM, RGBA16_UNORM, RGBA16_SNORM they will all work and reinterpret the integers to either range [0; 1] or [-1; 1]

And if I write RGBA8_UINT, RGBA8_SINT then it will be converted to float in range [0; 255.0] or [-128; 127].

The same with RGBA16_FLOAT (aka half) -> float.

It is valid to e.g. specify RGBA8_UNORM but the shader only says float2 pos (i.e. only R & G channels get used). But it is not valid to specify RG8_UNORM when the shader says float4 pos because there's nothing to source B & A from (there's an old legacy behavior for .w though; where for missing values it gets set to 1.0f).

However the same is not true for the other datatypes. For example if I write int4 BlendIndices, then the only valid formats are exclusively RGBA8_SINT, RGBA16_SINT and RGBA32_SINT.

As for what happens with half3 TangentX (which is like a float3, but it's not a float). It is a mystery. It's not defined, and different drivers behave differently. Mostly because 16-bit half isn't popular yet (but soon, probably).

VS -> PS Signature

struct FVertexOutput
{
    float4 ScreenPosition;
    float4 CameraVector_FogA;
    float4 Position @position;
};

This is the correct way. However I shall add a few notes:

IIRC Vulkan requires the struct to match exactly.
D3D11 / HLSL 5.0 (dunno about D3D12 / HLSL 5.1 / 6.x) allows for partial matching.

e.g. if the VS outputs this:

struct FVertexOutput
{
    float4 Position @position; // Note I moved position first! This is important
    float4 ScreenPosition;
    float4 CameraVector_FogA;
};

and PS inputs this:

struct FVertexOutput
{
    float4 Position @position; // Note I moved position first! This is important
    float4 ScreenPosition;
    // Missing CameraVector_FogA
};

Then the signature between VS and PS is still considered valid (i.e. as long as the same first N used bytes match exactly in name, type and size it's ok).

Syntax prefixes

// Heretical to C, functions start with the word "function" ...

// local variables are defined with a "var" keyword, which is also heretical
// to C users, but I couldn't help but notice WebGPU did this too, probably
// because parsing C is hard.

Ahh, I understand making the parser easier.

But you may find adoption problems here. When porting existing engines to new APIs, these differences can be a mayor PITA.

Adding function may be an inconvenience. But adding var to every single variable can be a significant effort.

Some engines have significantly large Compute codebases.

Anecdotically Apple's Metal had something similar. Functions could either be:

void fooA(); // Must be forward declared first
void fooA() {}

inline void fooB() {} // If not forward declared, then it must be prepended with inline

And ended up bending the knee here allowing void fooB() {} without inline nor forward declarations in a subsequent Metal lang revision.

I don't know the reasons behind it though.

Buffer type declaration

function @vertex FVertexOutput VertexMain(FVertexInput Input @inputs, FVertexConstants Constants @buffer(0), BoneMatricesConstants BoneBuffer @buffer(1))

@buffer is too broad here. Metal is a great example here. There's constant and device memory. The former are UBOs in Vulkan terms. The latter are either read-only SSBOs in Vk terms.

Vulkan also calls TBOs a "buffer" even though its type cannot be described with C-like code.

As a suggestion:

// Option A
function @vertex FVertexOutput VertexMain(
    FVertexInput Input @inputs,
     FVertexConstants Constants @buffer(0, constant),
     BoneMatricesConstants BoneBuffer @buffer(1, device))

// Option B (basically Metal)
function @vertex FVertexOutput VertexMain(
    FVertexInput Input @inputs,
     constant FVertexConstants Constants @buffer(0),
     device BoneMatricesConstants BoneBuffer @buffer(1))

// Option C
function @vertex FVertexOutput VertexMain(
    FVertexInput Input @inputs,
     FVertexConstants Constants @buffer_constant(0),
     BoneMatricesConstants BoneBuffer @buffer_device(1))

Note that buffer_constant slots should not overlap buffer_device.

Another issue to consider is that alignment constraints of FVertexConstants is not the same depending on whether it's a constant or a device function.

Consider this:

function @vertex FVertexOutput VertexMain(
    FVertexInput Input @inputs,
     FVertexConstants Constants @buffer(0, constant),
     BoneMatricesConstants BoneBuffer @buffer(1, device))
{
   doSomething( Constants, BoneBuffer );
}

void doSomething( FVertexConstants Constants, BoneMatricesConstants BoneBuffer )
{
    // Ooops!!! Is FVertexConstants a constant or device???
    // This is important. Without it we cannot compile doSomething
    // because we don't know the alignment rules inside Constants & BoneBuffer
}

// Metal solves it like this:
void doSomething( constant FVertexConstants Constants, device BoneMatricesConstants BoneBuffer )
{
    // Now we know their alignments. FVertexConstants must come from a constant buffer
    // and BoneBuffer from a device buffer
}

michael-nischt commented 2 years ago

Not really a hard pain point and more personal preference but I don't like var. Makes reading code harder. Maybe can be an optional feature. That way I can at least have own code without.

matias-eduardo commented 2 years ago

I've got an idea, but not sure if it's good.

This syntax is SO close to being C compatible that I would just drop the @ for ATtributes and use a prefix instead, e.g. AT_*. It could also be an additional optional syntax if you want to keep the @. This way you could just put this in a .c file and do something like...

#if SOFTWARE_RENDERER
#define function
#define var
#define AT_attribute(n)
#define AT_buffer(n)
// etc.
#endif

// INSERT SHADER STRUCTS

#if SOFTWARE_RENDERER
typedef struct FVertexConstants FVertexConstants;
typedef struct BoneMatricesConstants BoneMatricesConstants;
// etc.
#endif

// INSERT SHADER FUNCTIONS

...to remove shader language-specific syntax. Later on, you could then fill in the blanks to add a custom software renderer and reuse this code.

EDIT: +1 on using $ instead as suggested by @harryisgamer below.

darkerbit commented 2 years ago

Is var really required for parsing, inside a statement block wouldn't just a straight type be unambiguous?

harryisgamer commented 2 years ago

I think I'm neutral about your usage of function and var. Have C-like syntax after them feels a bit weird to me though. Most languages I've seen that use function and var use it so they can have syntax like this (a la Rust):

function float3 MorphPosition(FVertexInput Input)
{
}
// becomes
function MorphPosition(Input: FVertexInput) -> float3
{
}

var float3x3 Tangents = MorphTangents(Input);
// becomes
var Tangents: float3x3 = MorphTangents(Input);
// or, with type inference
var Tangents = MorphTangents(Input);

But honestly for shaders I think I prefer C-like syntax, and making it easier to port shaders over is always a plus. WGSL has many syntax changes, and I don't really like a lot of them for shaders. A lot of people who write shaders are used to GLSL/HLSL. If you end up keeping var, maybe add optional type inference (as shown above).

If you want to use a special symbol for attributes (which I personally like), I would suggest using the $ symbol instead of the @ symbol, as dollar signs are allowed as identifiers in C/C++ in MSVC, GCC, and clang (with GNU extensions enabled). This would allow you to define macros and remove the attributes in most C compilers, while also having nicer syntax than AT_<attribute> (as suggested above).

float4 Position @position;
// becomes
float4 Position $position;

Overall, I like a lot of the shader-specific feature considerations, but I think the syntax could use some improvements.

icculus commented 2 years ago

This is all really great feedback, thank you everyone!

Replies to several people here:

@buffer is too broad here.

I was afraid you were going to say that. :/ I was hoping to avoid the distinction altogether, at least for a first release, but your suggestions are reasonable...I'll look into incorporating them.

Mostly because 16-bit half isn't popular yet (but soon, probably).

I've been on the fence about half datatypes, because it's basically opaque data in an app's native code, since there's no native half type in C (or in...anything else?).

It's easy to add later if needed, or we can add it now--as long as it doesn't need to appear in a constants buffer that's visible to native code, we can just have the real shader treat it as float behind the scenes if worse comes to worst.

IIRC Vulkan requires the struct to match exactly.

I think we'll probably call it undefined behavior if they don't match, because I suspect I won't have enough control at the low level APIs to dictate anything else.

The other option is that we flood this struct with @attributes too, and behind the scenes we assemble the actual struct the next stage needs from it:

struct FVertexOutput
{
    float4 position @position;
    float4 color @attribute(0);
    float4 texcoord @attribute(1);
};

struct FFragmentInput
{
    float4 texcoord @attribute(1);  // note that this doesn't have to be in the same order!
    float4 color @attribute(0);
};

...but my suspicion is this is adding more work to shader developers and me, and more risk of bugs, just to let the fragment shader ignore the position field.

Adding function may be an inconvenience. But adding var to every single variable can be a significant effort.

Seems to be more pushback on "var" than "function" so far. See further notes in replies below, though.

I will say that none of these APIs (except maybe, vaguely, WebGPU?) is promising extreme portability as an end goal, even if Vulkan happens to run on lots of discrete platforms. Which is to say it's not unreasonable to assume, if this all works out (a significant "IF," granted), that new games might target this language and not migrate shaders to anything else. Maybe a fool's dream, though. But what is certain: in current times, you have to migrate to several shader languages if you want to port games, so I doubt a project of any complexity does it by hand...they either generate shaders automatically for a target platform, or automatically transpile from one language that humans develop in to everything else. In that sense, some of the language specifics don't matter that much. And if they go straight to bytecode, they matter even less.

This syntax is SO close to being C compatible that I would just drop the @ for ATtributes and use a prefix instead, e.g. AT_*

I'll probably add a guaranteed preprocessor define (like __SDLSL__ or something), so code can do this:

#ifdef __SDLSL__
#define AT_BUFFER(x) @buffer(x)
#else
#define AT_BUFFER(x)
#endif

But that being said: the part that would need to be seen by C, the constants struct, has no @attributes. And outside of that one small piece, being parsed by a C compiler is not a goal.

Is var really required for parsing, inside a statement block wouldn't just a straight type be unambiguous?

We might be able to get away with it here, because I'm cutting out a lot of C stuff that no one actually cares about but makes everything ambiguous as heck, in which case the "var" ends up being purely defensive on my part and could be dropped. I'll know more as I get further into this. No promises yet.

(but again: WebGPU shaders use it, so you can never escape it fully in the real world.)

darksylinc commented 2 years ago

The other option is that we flood this struct with @attributes too, and behind the scenes we assemble the actual struct the next stage needs from it:

IMO that's unnecessary. It just adds more work for everyone (you and the API users) for little gain.

But what is certain: in current times, you have to migrate to several shader languages if you want to port games, so I doubt a project of any complexity does it by hand...they either generate shaders automatically for a target platform, or automatically transpile from one language that humans develop in to everything else. In that sense, some of the language specifics don't matter that much.

Yes. But there's one catch: one approach is to use semi-automation; thanks to HLSL/GLSL/Metal being so close to each other that it allows to share most of code, while abstracting the differences away with a few macros (e.g. GLSL can use #define float2 vec2 and so on; all the texture() fetching functions need macros).

The main difference is often the entry function signature and buffer/texture declaration.

However if you add "var" you stray too much from HLSL/GLSL/Metal. One could do #define var and add var to every local variable; but like I said it's a lot of code.

What I'm saying is that you either keep close to HLSL/GLSL/Metal syntax, or if you just want to introduce breaking stuff (like var) then you can just take even bigger liberties about the language.

I was afraid you were going to say that. :/ I was hoping to avoid the distinction altogether, at least for a first release, but your suggestions are reasonable...I'll look into incorporating them.

Btw, speaking of alignment std140 (i.e. UBO) rules can be improved upon. For example according to GLSL the following:

struct MyStruct
{
    vec3 a;
    float b;
};

Corresponds to the following C++:

struct MyStruct
{
    float4 a; // 4 not 3!
    float b;
};

b could easily be part of a.w but it is not due to alignment.

The 16-byte stride only makes sense when indexing variables:

struct MyStruct
{
    float3 a[2];
    float b;
};

a[idx] // -> idx is *not* known at compile time

Because some hardware can only do that if the float3s are 16-byte in stride.

Metal has an elegant solution where they introduced packed_float3. This datatype has a few limitations float3 doesn't have, but allows better packing of structs.

I guess we could go even further than that (theoretical syntax, doesn't exist yet):

struct MyStruct
{
    {
       float3 a;
       float b;
    }[2];
};

// idx may or may not be known at compile time
float localA = a[idx].xyz;
float localB = b[idx];

Which would be the same as the following valid syntax:

struct MyStruct
{
    float4 a_b[2];
};

float localA = a_b[idx].xyz;
float localB = a_b[idx].w;

With the current rules and syntax, we're often either forced to have a lot of wasted padding; or forced to manually pack/unpack everything into float4 which hurts readability.

kg commented 2 years ago

Generally looks good, a few thoughts:

// These are constants (uniforms). We put them in a struct, because we'll be // setting them up in a buffer on the CPU and sending them to GPU memory all // at once, where the shader will also be treating them as fields in a block // of memory instead of individual variables. In that sense, there's no value // in considering them individually.

In current fxc+mojoshader FNA world, I've had trouble with this approach because bundling the uniforms up makes it harder to optimize out unused ones. Is that likely to be any sort of a problem here? I'm guessing you can build everything such that you can optimize it since you're doing everything from scratch.

The convenience of uniform structs is great, though, so I don't mind it being the convention.

// Notable: all structs are passed by reference (which means if you change // a field, it will be changed for the caller too). There are no pointers, // references, or copies made. We should probably offer a way to mark it "const" though...

They should be const by default (fail on mutation) unless you explicitly pass them by-ref. It's too much of a footgun otherwise and it will result in 'best practices' of not using struct parameters.

What happens if I pass one of the global uniform structs as a parameter?

What are vertex textures going to look like?

bruxisma commented 2 years ago

I'd like to briefly make an argument against the @keyword (as much as I am a fan of the @tribute pun), and point out that C23 will have standardized the [[attribute]] syntax found in C++11 and later. This would allow solving a few problems here other folks have mentioned (e.g., instead of a packed_float3, one can have [[glsl::std140]] float3, as well as replace the @attribute(n) with [[sdl::index(n), spirv::layout(...)]], or however it ends up turning out in the end . This would also technically make it possible to write tooling that generates valid C, C++, and shader code, as well as use pre-existing tooling like Clang for additional tooling outside of having to expose the parser for general usage, which I don't believe is a goal here.

Not suggesting to adopt C++ behavior here (though I could see some value in , just mentioning that C++ folks can work around any attributes and even remove them from declarations via metaprogramming and we can just "deal with it", C preprocessors can #define around things, etc.

OpenCL adopted __attribute__, but the C++ OpenCL language permits the use of [[...]]. Additionally [[...]] is fairly easy to parse given that none of the existing shading languages out there support [[ as a valid token that will have some meaning (to my knowledge).

That said, going with @ (or $ as mentioned above) should in theory be fine, as @, $ and the grave accent have been approved for addition to the source and execution character set in C23, so C preprocessors in the (near) future shouldn't barf on textual macros containing these characters.

icculus commented 2 years ago

In current fxc+mojoshader FNA world, I've had trouble with this approach because bundling the uniforms up makes it harder to optimize out unused ones.

I can't speak to what FNA does, but MojoShader (which deals with D3D9 level shaders) takes enormous efforts to bundle individual uniforms into a single array behind the scenes, eliminating ones that aren't actually used in the shader when bundling this up.

In that sense, it would just be a buffer of bytes eventually destined for a struct instead of an array...or, more likely in MojoShader's worldview, you'd end up with struct ThisShadersUniforms { float4 all_the_arguments[15]; }; and a biblical level of macros in the generated code. :)

But that might not mean anything at the FNA level, so let's talk about this more if that's going to be an issue.

They should be const by default (fail on mutation) unless you explicitly pass them by-ref. It's too much of a footgun otherwise and it will result in 'best practices' of not using struct parameters.

Okay, that's reasonable. And passing a const thing by-ref will be a compiler error. Constness can't be casted away (there are no casts at all, actually, at the moment...if you need something un-consted, copy it.)

What happens if I pass one of the global uniform structs as a parameter?

Compiler error if you try to pass it by-ref, because they're const by default.

What are vertex textures going to look like?

Much like buffers (whatever we eventually end up with, this example is just based on the code at the top of this thread)...

// The thing associated with @sampler(0) is bound with a call to SDL_GpuSetRenderPassVertexSampler().
function @vertex FVertexOutput VertexMain(FVertexInput Input @inputs, FVertexConstants Constants @buffer(0), Sampler MySampler @sampler(0)) {
    // sample top left from the texture:
    float4 texel = sample(MySampler, float2(0, 0));
}

None of that is solid yet, including the intrinsic function name "sample" and the words "top left". I was saving this for when I get to fragment shaders. Modern APIs do let you use samplers in the vertex shader though (UE3 did not because D3D9 and OpenGL2 did not, hence it missing from this example), so whatever gets finalized there will apply here.

icculus commented 2 years ago

you'd end up with struct ThisShadersUniforms { float4 all_the_arguments[15]; };

Alternately, we might just let you assign arrays to buffers directly, I haven't really thought it through.

(It would be extremely bad practice to have a separate buffer for each uniform, though, so I'm inclined to forbid any arbitrary datatype from being assigned in this way, as someone rolling in from an older OpenGL or Direct3D will certainly be tempted to emulate individual uniforms as such.)

darksylinc commented 2 years ago

(It would be extremely bad practice to have a separate buffer for each uniform, though, so I'm inclined to forbid any arbitrary datatype from being assigned in this way, as someone rolling in from an older OpenGL or Direct3D will certainly be tempted to emulate individual uniforms as such.)

Metal does allow the following though:

constant float &myParam [[buffer(0)]],
constant float *myArray [[buffer(1)]]

And it has been extremely useful when one needs to pass a single parameter (usually a per-draw parameter).

It would be extremely bad practice to have a separate buffer for each uniform

True. But there's usually only 8-15 max slots available. If that doesn't ring the user a red alarm, I'm afraid that user will do a lot of worse mistakes than that.

kg commented 2 years ago

In current fxc+mojoshader FNA world, I've had trouble with this approach because bundling the uniforms up makes it harder to optimize out unused ones.

I can't speak to what FNA does, but MojoShader (which deals with D3D9 level shaders) takes enormous efforts to bundle individual uniforms into a single array behind the scenes, eliminating ones that aren't actually used in the shader when bundling this up.

In that sense, it would just be a buffer of bytes eventually destined for a struct instead of an array...or, more likely in MojoShader's worldview, you'd end up with struct ThisShadersUniforms { float4 all_the_arguments[15]; }; and a biblical level of macros in the generated code. :)

But that might not mean anything at the FNA level, so let's talk about this more if that's going to be an issue.

Sounds totally reasonable. On that note, I'd strongly advise exposing some basic metadata out of the compiler - for example, being able to verify that the uniform layout matches what your code thinks it is can be really valuable. For one console title I worked on we pulled basic layout info out of the shader and then matched it against the struct layout at startup and that caught lots of alignment/packing issues along with cases where people forgot to update the C struct and the shader uniform buffers in sync with each other. SDL_GPU doesn't need to do those checks, but it would be good to have a way to implement them as an end-user.

Can imagine uniform metadata also being useful for GUI tweaker tools.

icculus commented 2 years ago

This is a first shot at a Lemon grammar for this language, with all the glue code missing.

This is untested, and matches the original proposal; it doesn't have things like device/constant memory yet or a by-ref flag of some sort, etc.

(Also, this grammar is currently unambiguous with or without the FUNCTION and VAR keywords, because I threw out a bunch of C flexibility things that people can live without, and C details that aren't relevant, and a few C footguns too, but under the assumption that Lemon doesn't actually get grumpy about conflicts until you actually have everything wired up, they continue to exist. Honestly, I sort of like them, but I get the concerns too, so we'll see how that shakes out.)

Unless you like looking at formal grammars for programming languages, you can skip the rest of this post, it won't add anything new to the conversation, I think.

The language also offers a preprocessor that (as closely as possible) matches C's, which has only been briefly mentioned here so far. The preprocessor code is already written, but that part of the pipeline happens outside of parsing, so you won't see #ifdef and such in the grammar here, even though it's available to the shader.

// operator precedence (matches C spec)...
%left COMMA.
%right ASSIGN ADDASSIGN SUBASSIGN MULASSIGN DIVASSIGN MODASSIGN LSHIFTASSIGN
       RSHIFTASSIGN ANDASSIGN ORASSIGN XORASSIGN.
%right QUESTION.
%left OROR.
%left ANDAND.
%left OR.
%left XOR.
%left AND.
%left EQL NEQ.
%left LT LEQ GT GEQ.
%left LSHIFT RSHIFT.
%left PLUS MINUS.
%left STAR SLASH PERCENT.
%right TYPECAST EXCLAMATION COMPLEMENT MINUSMINUS PLUSPLUS.
%left DOT LBRACKET RBRACKET LPAREN RPAREN.

// bump up the precedence of ELSE, to avoid shift/reduce conflict on the
//  usual "dangling else ambiguity" ...
%right ELSE.

// The rules...

// start here.
%type shader { SDL_SHADER_AstShader * }
%destructor shader { delete_shader(ctx, $$); }
shader(A) ::= translation_unit_list(B). { A = new_shader(ctx, B); }

%type translation_unit_list { SDL_SHADER_AstTranslationUnit * }
%destructor translation_unit_list { delete_translation_unit(ctx, $$); }
translation_unit_list(A) ::= translation_unit(B). { A = B; }
translation_unit_list(A) ::= translation_unit_list(B) translation_unit(C). { B->next = C; A = B; }

// At the top level of the shader, it's only struct declarations and
// functions at the moment. This will likely expand to other things.
translation_unit(A) ::= struct_declaration(B).
translation_unit(A) ::= function(B).

at_attrib(A) ::= AT IDENTIFIER.
at_attrib(A) ::= AT IDENTIFIER LPAREN INT_CONSTANT RPAREN.

struct_declaration(A) ::= STRUCT IDENTIFIER(B) LBRACE struct_member_list(C) RBRACE SEMICOLON.

struct_member_list(A) ::= struct_member(B).
struct_member_list(A) ::= struct_member_list(B) struct_member(C).

// the first identifier is a datatype, but it might be a user-defined struct. To simplify the
// grammar, we don't treat the many built-in types as unique tokens or have a USERTYPE token,
// and let semantic analysis sort it out.
struct_member(A) ::= IDENTIFIER(A) IDENTIFIER(B) SEMICOLON.
struct_member(A) ::= IDENTIFIER(A) IDENTIFIER(B) at_attrib(C) SEMICOLON.
struct_member(A) ::= IDENTIFIER(A) IDENTIFIER(B) LBRACKET INT_CONSTANT RBRACKET SEMICOLON.
struct_member(A) ::= IDENTIFIER(A) IDENTIFIER(B) LBRACKET INT_CONSTANT RBRACKET at_attrib(C) SEMICOLON.

// the original proposal had "function @vertex RetType FunctionName" but everything else
// has @attributes at the end.
function(A) ::= FUNCTION return_type(B) IDENTIFIER(C) function_params(D) statement_block(E).
function(A) ::= FUNCTION return_type(B) IDENTIFIER(C) function_params(D) at_attrib(E) statement_block(F).

return_type(A) ::= VOID.
return_type(A) ::= IDENTIFIER(B).  // let semantic analysis figure it out.

function_params(A) ::= LPAREN RPAREN.
function_params(A) ::= LPAREN VOID RPAREN.
function_params(A) ::= LPAREN function_param_list(B) RPAREN.

function_param_list(A) ::= function_param(B).
function_param_list(A) ::= function_param_list(B) COMMA function_param(C).

// the first identifier is a datatype, but it might be a user-defined struct. To simplify the
// grammar, we don't treat the many built-in types as unique tokens or have a USERTYPE token,
// and let semantic analysis sort it out.
function_param(A) ::= IDENTIFIER(A) IDENTIFIER(B).
function_param(A) ::= IDENTIFIER(A) IDENTIFIER(B) at_attrib(C).

statement_block(A) ::= LBRACE statement_list(B) RBRACE.

statement_list(A) ::= statement(B).
statement_list(A) ::= statement_list(B) statement(C).

statement(A) ::= BREAK SEMICOLON.
statement(A) ::= CONTINUE SEMICOLON.
statement(A) ::= DISCARD SEMICOLON.  // obviously only valid in fragment shaders; semantic analysis will check that.
statement(A) ::= var_declaration(B) SEMICOLON.
statement(A) ::= DO statement(C) WHILE LPAREN expression(D) RPAREN SEMICOLON.
statement(A) ::= WHILE LPAREN expression(C) RPAREN statement(D).
statement(A) ::= FOR LPAREN for_details(B) RPAREN statement(C).
statement(A) ::= IF LPAREN expression(C) RPAREN statement(D).
statement(A) ::= IF LPAREN expression(C) RPAREN statement(D) ELSE statement(E).
statement(A) ::= SWITCH LPAREN expression(C) RPAREN LBRACE switch_case_list(D) RBRACE.
statement(A) ::= SEMICOLON.
// NO EXPRESSIONS AS STANDALONE STATEMENTS! statement(A) ::= expression(B) SEMICOLON.
statement(A) ::= RETURN SEMICOLON.
statement(A) ::= RETURN expression(B) SEMICOLON.
statement(A) ::= assignment_statement(B) SEMICOLON.
statement(A) ::= statement_block(B).
//statement(A) ::= error SEMICOLON. { A = NULL; }  // !!! FIXME: research using the error nonterminal

// assignment is a statement, not an expression (although we tapdance to make this work with a C-like for loop syntax),
// which solves a nasty class of bugs in C programs for not much loss in power.
// We allow multiple assignments for JUST the '=' operator, as syntactic sugar without it being a list of expressions.
assignment_statement(A) ::= lvalue(B) ASSIGN assignment_statement_list(C) expression(D).
assignment_statement(A) ::= lvalue(B) calc_then_assign_operator(C) expression(D).

assignment_statement_list(A) ::= lvalue(B) ASSIGN.
assignment_statement_list(A) ::= assignment_statement_list(B) lvalue(C) ASSIGN.

calc_then_assign_operator(A) ::= PLUSASSIGN|MINUSASSIGN|STARASSIGN|SLASHASSIGN|PERCENTASSIGN|LSHIFTASSIGN|RSHIFTASSIGN|ANDASSIGN|ORASSIGN|XORASSIGN(B). { A = @B; }

for_details(A) ::= for_initializer(B) SEMICOLON expression(C) SEMICOLON expression(D).
for_details(A) ::= for_initializer(B) SEMICOLON expression(C) SEMICOLON.
for_details(A) ::= for_initializer(B) SEMICOLON SEMICOLON expression(D).
for_details(A) ::= for_initializer(B) SEMICOLON SEMICOLON.

for_initializer(A) ::= expression(B).
for_initializer(A) ::= var_declaration(B).
for_initializer(A) ::= assignment_statement(B).
for_initializer(A) ::= .

switch_case_list(A) ::= switch_case(B). { A = B; }
switch_case_list(A) ::= switch_case_list(B) switch_case(C). { A = C; A->next = B; }

// You can do math here, as long as it produces an int constant.
//  ...so "case 3+2:" works.
%type switch_case { SDL_SHADER_AstSwitchCases * }
%destructor switch_case { delete_switch_case(ctx, $$); }
switch_case(A) ::= CASE expression(B) COLON statement_list(C). { REVERSE_LINKED_LIST(SDL_SHADER_AstStatement, C); A = new_switch_case(ctx, B, C); }
switch_case(A) ::= CASE expression(B) COLON. { A = new_switch_case(ctx, B, NULL); }
switch_case(A) ::= DEFAULT COLON statement_list(B). { REVERSE_LINKED_LIST(SDL_SHADER_AstStatement, B); A = new_switch_case(ctx, NULL, B); }
switch_case(A) ::= DEFAULT COLON. { A = new_switch_case(ctx, NULL, NULL); }

// the first identifier is a datatype, but it might be a user-defined struct. To simplify the
// grammar, we don't treat the many built-in types as unique tokens or have a USERTYPE token,
// and let semantic analysis sort it out.
var_declaration(A) ::= VAR IDENTIFIER(B) IDENTIFIER(C).
var_declaration(A) ::= VAR IDENTIFIER(B) IDENTIFIER(C) ASSIGN expression(D).

lvalue(A) ::= IDENTIFIER(B).
lvalue(A) ::= lvalue(A) DOT lvalue(B).
lvalue(A) ::= lvalue(A) LBRACKET expression(C) RBRACKET.

arguments(A) ::= LPAREN RPAREN. { A = NULL; }
arguments(A) ::= LPAREN argument_list(B) RPAREN.

argument_list(A) ::= expression(B).
argument_list(A) ::= argument_list(B) COMMA expression(C).

// here we go.
expression(A) ::= lvalue(B).
expression(A) ::= INT_CONSTANT(B).
expression(A) ::= FLOAT_CONSTANT(B).
expression(A) ::= TRUE.
expression(A) ::= FALSE.
expression(A) ::= LPAREN expression(B) RPAREN.
expression(A) ::= IDENTIFIER(B) arguments(C).  // this might be a function call or datatype constructor; semantic analysis will figure that out!
expression(A) ::= expression(B) PLUSPLUS.
expression(A) ::= expression(B) MINUSMINUS.
expression(A) ::= PLUSPLUS expression(B).
expression(A) ::= MINUSMINUS expression(B).
expression(A) ::= PLUS expression(B).
expression(A) ::= MINUS expression(B).
expression(A) ::= COMPLEMENT expression(B).
expression(A) ::= EXCLAMATION expression(B).
expression(A) ::= expression(B) STAR expression(C).
expression(A) ::= expression(B) SLASH expression(C).
expression(A) ::= expression(B) PERCENT expression(C).
expression(A) ::= expression(B) PLUS expression(C).
expression(A) ::= expression(B) MINUS expression(C).
expression(A) ::= expression(B) LSHIFT expression(C).
expression(A) ::= expression(B) RSHIFT expression(C).
expression(A) ::= expression(B) LT expression(C).
expression(A) ::= expression(B) GT expression(C).
expression(A) ::= expression(B) LEQ expression(C).
expression(A) ::= expression(B) GEQ expression(C).
expression(A) ::= expression(B) EQL expression(C).
expression(A) ::= expression(B) NEQ expression(C).
expression(A) ::= expression(B) AND expression(C).
expression(A) ::= expression(B) XOR expression(C).
expression(A) ::= expression(B) OR expression(C).
expression(A) ::= expression(B) ANDAND expression(C).
expression(A) ::= expression(B) OROR expression(C).
expression(A) ::= expression(B) QUESTION expression(C) COLON expression(D).

/* end of SDL_shader_parser.lemon ... */

Hugobros3 commented 2 years ago

Hi, I have been linked to this on Discord and I feel the need to chime in on this:

// Notable: all structs are passed by reference (which means if you change // a field, it will be changed for the caller too). There are no pointers, // references, or copies made. We should probably offer a way to mark it "const" though...

As others have pointed out, this is a big footgun in terms of usability. It corresponds to only allowing inout parameters in GLSL, which seems to me like a very strange decision. I think this is a bad idea as well, and I hope you change this to default to the standard GLSL/HLSL behaviour, or even better, immutable function parameters.

However, I also want to add that I find this decision suspect: I read between the lines (ie I might be completely wrong about that, and I hope you don't mind my writeup) that this is a decision made "for performance", and if that's indeed the case I would like to dispel some commonplace myths on calling conventions, SPIR-V and shading languages in general (they all have similar semantics, hence why something like SPIRV-Cross can exist).

The idea of passing by reference/value has to do with calling conventions:

On a conventional machine, you may send functions data through registers, or a "stack" (ie just regular main memory)
Passing by value means that the argument receives a copied version of the parameters it was called with, and by changing those copies it may not change their original versions.
Passing by value doesn't mean you have to use the registers, it's a perfectly valid to create an extra copy of the object on the stack, and have the callee may modify that one without global side effects, and can happen with large structs.
Passing by reference means the data was already in main memory (we have a pointer to it, or can obtain one using the address-of (&) operator, and we're simply letting the function know about where to find it by providing a pointer to it, but since there is no copy any modifications will have global visibility.

Often enthusiastic programmers will approach whether to pass by value or by reference from a "performance" focused point of view, because they view copying the data in order to pass it as a costly thing to do. Sadly this is quite often terribly misguided:

Placing arguments in register is the fastest, since there is no chance for cache misses and register-to-register copies can even be elided in most cases.
Calling conventions typically deconstruct structs into their components, meaning they can fit them into registers unless there is a large number of them
If the compiler has sufficient guarantees about who writes to the memory, the copy when passing by value a large struct may be elided.
The actual cost of copying is almost never benchmarked and people assume it is relevant without checking. This often means the performance issues are imagined, not measured. Any decision made in the name of performance should be backed up by solid evidence and/or or solid arguments grounded in measurable things.

Now, SPIR-V is not your usual platform. You don't really get that low level of control, you have an LLVM-like IR, but contrary to the naming, that's not particularly low-level. In particular, heavy-handed optimization passes will be applied to your code inside the driver, so any micro-optimisation will typically be undone/made redundant/irrelevant.

SPIR-V shaders have specially crafted rules which allow implementing them without a stack at all: in particular the rules for logical pointers, the restrictions on call graphs against recursion, and the lack of function pointers ensure that lowering function calls by inlining them is always a viable approach.

In such scenarios, the distinction between passing by value or using a pointer is irrelevant, because once all functions have been inlined, logical pointers will be resolved to the underlying OpVariables and/or OpAccessElements, which means all the memory reads and writes can then be turned into virtual registers and phi nodes by the mem2reg optimization. (Also known as SSA conversion)

All of this bizzaro compiler talk means that in the end, talking about passing by reference or value is meaningless when talking about graphical shaders for the likes of OpenGL or Vulkan, because you just don't have that sort of low-level control. What you probably meant by that is that actual struct parameters are disallowed and only logical pointers to them may be passed to functions, resulting in inout-like semantics. Which, I hope I convinced you of, doesn't offer performance advantages, and I think others would agree is a terrible unnecessary footgun from a design/user experience perspective.

icculus commented 2 years ago

As others have pointed out, this is a big footgun in terms of usability. It corresponds to only allowing inout parameters in GLSL

Yeah, I think we're going to treat these as const by default (less for performance and more for compiler-enforced safety), and let programs specify some sort of by-ref flag for functions that need to modify a struct or array passed in by the caller. Since there are no pointers, const stuff can be in a global, or a register, or on the stack, or whatever a GPU offers...it doesn't really matter, it all looks the same to the program and we can do whatever works best and/or is available to us.

Plain-old-data will always be by-val and modifiable by default (changes do not affect the caller), because that's what people expect, although I guess we have to decide if we think built-in vectors and matrix types count as "plain old data".

bartwe commented 2 years ago

Please make UTF-8 the codepage for all shader source.

icculus commented 2 years ago

Please make UTF-8 the codepage for all shader source.

Yes, because it makes it so comments can be in any language and we don't have to worry about byte order marks or text editors (probably) mangling source code, but we can agree that we only want A-Z, 0-9, and underscore for actual identifiers? I don't want to mess around with...

for (🎉 = 🤪;  🎉 < 🤣; 🎉++) {}

...if we allow Unicode in identifiers.

kg commented 2 years ago

Please make UTF-8 the codepage for all shader source.

Yes, because it makes it so comments can be in any language and we don't have to worry about byte order marks or text editors (probably) mangling source code, but we can agree that we only want A-Z, 0-9, and underscore for actual identifiers? I don't want to mess around with...
for (🎉 = 🤪;  🎉 < 🤣; 🎉++) {}
...if we allow Unicode in identifiers.

I'm not sure how important this actually is for people from other cultures, but it might make sense to just allow a broader set of characters for identifiers but not stuff like emoji. I think unicode character classes make this feasible. I suppose it is also possible that the set of allowed characters could be expanded in updates though instead of being huge to begin with...

kg commented 2 years ago

For one example, ECMAScript specifies the valid characters in identifiers pretty concisely:

https://262.ecma-international.org/5.1/#sec-7.6

So we could do a stripped down version of that.

icculus commented 2 years ago

Realizing the example (and thus, the example grammar) has no global variables. Are these actually valuable in shaders (outside of globals like gl_Position in GLSL, and uniforms in GLSL/HLSL, which we handle differently here)?

icculus commented 2 years ago

So we could do a stripped down version of that.

Okay, this is interesting, I'll aim for something like this.

darksylinc commented 2 years ago

I'm not sure how important this actually is for people from other cultures,

I've seen a trend for CJK users to start using var names in their own languages, but they're all extremely used to writing code in ASCII anyway.

Outside of comments and string literals(*) UTF8 support would need its own very lengthy thread.

It can be tricky to define what a "letter" is. That's because even if we convert all characters to UTF-32, we still don't have a correspondence between 1 dword = 1 letter.

For example the character ö can be written in two ways: its precomposed form ö or its decomposed form o + ¨:

Precomposed ö codepoint U+00F6; codeunits C3 B6
Decomposed ö codepoints U+006F and U+0308; codeunits 6F CC 88 Note that there can be more than two codepoints. For example the glyph ṓ is made up from 3 codepoints, but should be rendered as a single character

The technical word of what constitutes a "letter" is probably an UTF "cluster".

This scenario actually has a solution: it's called UTF normalization (a pass where all decomposed forms get converted to its precomposed form).

However:

Not all characters have a precomposed form. e.g. the devangari letter पा is one cluster made up from codepoints U+092A and U+093E is stored decomposed (प + ा)
A cluster could be anything, for example the superscript word ᶜʰᵃʳᵃᶜᵗᵉʳ is technically 9 clusters (one for each letter in character).
We are not even touching RTL languages such as arabic and hebrew which have its own can of worms; even more so if RTL and LTR is mixed up together (e.g. ASCII characters in the middle of a hebrew variable name).

(*) String literals are actually supported by Vulkan's printf. Intended for debugging.

By the time you end up supporting everything required to "properly" (whatever that means) support UTF8 in variable & function names, then emoji support comes out automatically since it just uses the same system these languages use. Not to mention you need a full (very big) library like ICU & harfbuzz to parse all the Unicode details.

Most westerners just aren't familiar with them due to our reliance on the ASCII character set; and naively believe all languages are represented by UTF-32 and it's just emojis and stuff like ˢᵘᵖᵉʳˢᶜʳᶦᵖᵗ that uses the advanced functionality.

Thus basically: it is possible, but I strongly suggest that (outside of comments and string literals) it is left for a separate debate.

darksylinc commented 2 years ago

Realizing the example (and thus, the example grammar) has no global variables. Are these actually valuable in shaders (outside of globals like gl_Position in GLSL, and uniforms in GLSL/HLSL, which we handle differently here)?

Metal forbids almost all global variables (except some stuff like compile-time-defined samplers) even for UBO/TBO/textures and that has been a huge pain for us.

Because our HLSL/GLSL shaders assumed UBOs were global variables. We ended up fixing it with a few macros, but it's a bit hacky/nasty.

I know where it's coming from; because Metal was built to be more C-like in the sense that you can compile it into an object file and then reuse that object in multiple shaders at link time. To do that, it's much easier if you forbid global variables (otherwise the presence of an UBO bound at slot N means that UBO is assumed to be bound in all shaders it's linked to)

Now if you're asking read/write global variables (that aren't SSBOs or imageStore) then no, you should disallow those types of global variables (just like HLSL & GLSL)

Hugobros3 commented 2 years ago

GLSL does allow unqualified global variables, these simply use (thread)-private memory. I believe it's been a feature of GLSL since the very start, and you can find code in the wild that uses them. SPIR-V also supports creating global mutable private variables in this fashion.

Relevant spec quote: https://www.cs.uaf.edu/2006/fall/cs381/ref/GLSLangSpec_1.10.59.pdf#G3.467180

Modern GLSL spec: https://www.khronos.org/registry/OpenGL/specs/gl/GLSLangSpec.4.60.html#storage-qualifiers

darksylinc commented 2 years ago

Btw I just realized FVertexOutput may have modifiers such as flat / nointerpolate / noperspective / etc which aren't being covered.

icculus commented 2 years ago

I just pushed the first shot at a parser!

The latest in revision control, in this Issue's new home at https://github.com/icculus/SDL_shader_tools, can successfully parse the original example code I posted at the top of this thread. I need to incorporate a bunch of feedback from this thread still, but now we're talking about changing the parser instead of willing one into existence from nothing.

If you want to see it work, clone the repo, run CMake over it, and run

./sdl-shader-compiler -T mysource.shader

It'll spit out roughly the same source code, but that output is generated from the Abstract Syntax Tree that the parser generates.

(One change required: the main function has the @vertex thing moved to the end to match everything else that takes an @attribute.)

gingerBill commented 2 years ago

Hello.

I am really excited someone (that isn't me) is finally making a new shader language! Thank you for doing this.

I notice that some of the comments are criticizing the declaration syntax only such as var float3x3 Result = ...;.

I understand that this is done to keep it as C-like as possible with its pseudo type-focused declaration syntax but it adds extra noise for not added benefit. A simple approach would be consistent with the qualifier focused declaration syntax, which would make it look a lot more like Go or Rust.

The rest of the syntax and smenatics for the language can remain the same or be whatever you like, but when people judge a language, I have noticed that the first thing that they focus on is the declaration syntax. Even if two languages have wildly different semantics but have the same declaration syntax, people will judge them to be the same.

I have to done two mockups of your original code but with the newer declaration syntax and showing the benefits of what the newer syntax allows for too. Both declaration syntaxes will be easier to parse than any C-like type-first syntax, and as an added benefit allow for very basic type inference allowing the user to omit the type in the declaration if the type can be trivially inferred from the expression.

General idea:

// Original declaration syntax
var float3 foo = ...;
function float3 bar(float3 x @inputs) {...}
struct baz {
    float4x4 m;
    float4x3 bm[75];
};

// VERSION 1 (without : and ->)
var foo float3 = ...;
var foo = ...; // with type inference
function bar(x float3 @inputs) float3 {...}
struct baz {
    m  float4x4;
    bm [75]float4x3;
}

// VERSION 2 (with : and ->)
var foo: float3 = ...;
var foo = ...; // with type inference
function bar(x: float3 @inputs) -> float3 {...}
struct baz {
    m:  float4x4;
    bm: [75]float4x3;
}

The : and -> do not aid in parsing whatsoever (as you can tell) but do aid in reading, providing a distinct separation between the declaration and its type.

Minor note: I am the creator and main architect of the Odin programming language, and have created numerous different programming languages over the years. I know quite a lot of the pitfalls in terms of designs of languages. I also recommend reading an old article of mine On The Aesthetics of the Syntax of Declarations.

VERSION 1

Without : and -> (Go-like as some people will say)

// Qualifier-focused declaration syntax
// KEYWORD NAME TYPE
struct FVertexConstants
{
    ViewProjectionMatrix    float4x4;
    LocalToWorld            float4x4;
    WorldToLocal            float3x3;
    CameraPosition          float4;
    ScreenPositionScaleBias float4;
}

struct BoneMatricesConstants
{
    BoneMatrices [75]float4x3;
}

struct FVertexInput
{
    Position      float4 @attribute(0);
    TangentX      half3  @attribute(1);
    TangentZ      half4  @attribute(2);
    BlendIndices  int4   @attribute(3);
    BlendWeights  float4 @attribute(4);
    DeltaPosition float3 @attribute(5);
    DeltaTangentZ half3  @attribute(6);
}

struct FVertexOutput
{
    ScreenPosition    float4;
    CameraVector_FogA float4;
    Position          float4 @position;
}

func MorphPosition(Input FVertexInput) float3
{
    return Input.Position.xyz + Input.DeltaPosition;
}

func MorphTangents(Input FVertexInput) float3x3
{
    var Result float3x3 = 0;
    // Basic type inference based declaration
    var TangentX = float3((Input.TangentX / 127.5) - 1);
    var TangentZ = float4((Input.TangentZ / 127.5) - 1);
    var DeltaTangentZ = float3((Input.DeltaTangentZ / 127.5) - 1);
    Result[2] = normalize(float3(TangentZ.x, TangentZ.y, TangentZ.z) + DeltaTangentZ);
    Result[0] = normalize(TangentX - (dot(TangentX, Result[2]) * Result[2]));
    Result[1] = normalize(cross(Result[2], Result[0]) * TangentZ.w);
    return Result;
}

func SkinPosition(Input FVertexInput, BoneBuffer BoneMatricesConstants) float3
{
    var Position = float4(MorphPosition(Input), 1);
    var BoneMatrix = CalcBoneMatrix(Input, BoneBuffer);
    return Position * BoneMatrix;
}

func CalcBoneMatrix(Input FVertexInput, BoneBuffer BoneMatricesConstants) float4x3
{
    var BoneMatrix = Input.BlendWeights.x * BoneBuffer.BoneMatrices[Input.BlendIndices.x];
    BoneMatrix += Input.BlendWeights.y * BoneBuffer.BoneMatrices[Input.BlendIndices.y];
    BoneMatrix += Input.BlendWeights.z * BoneBuffer.BoneMatrices[Input.BlendIndices.z];
    BoneMatrix += Input.BlendWeights.w * BoneBuffer.BoneMatrices[Input.BlendIndices.w];
    return BoneMatrix;
}

func SkinTangents(Input FVertexInput, BoneBuffer BoneMatricesConstants) float3x3
{
    var Tangents = MorphTangents(Input);
    var TangentBasis float3x3;
    var BoneMatrix = CalcBoneMatrix(Input, BoneBuffer);
    TangentBasis[0] = float4(Tangents[0], 0) * BoneMatrix;
    TangentBasis[1] = float4(Tangents[1], 0) * BoneMatrix;
    TangentBasis[2] = float4(Tangents[2], 0) * BoneMatrix;
    return TangentBasis;
}

// I'm putting this attribute at the front compared to fields/parameters because it is easier for the user to read
@vertex
func VertexMain(Input FVertexInput @inputs, FVertexConstants Constants @buffer(0), BoneBuffer BoneMatricesConstants @buffer(1)) FVertexOutput
{
    var Output FVertexOutput;
    var WorldPosition = Constants.LocalToWorld * float4(SkinPosition(Input, BoneBuffer), 1);
    Output.ScreenPosition = Output.Position = Constants.ViewProjectionMatrix * WorldPosition;
    var TangentBasis = SkinTangents(Input, BoneBuffer);
    var WorldVector = Constants.CameraPosition.xyz - WorldPosition.xyz * Constants.CameraPosition.w;
    Output.CameraVector_FogA.xyz = TangentBasis * (Constants.WorldToLocal * WorldVector);
    Output.CameraVector_FogA.w = 0;
    return Output;
}

VERSION 2

With : and -> (Rust-like as some people will say)

// Qualifier-focused declaration syntax
// KEYWORD NAME TYPE
struct FVertexConstants
{
    ViewProjectionMatrix:    float4x4;
    LocalToWorld:            float4x4;
    WorldToLocal:            float3x3;
    CameraPosition:          float4;
    ScreenPositionScaleBias: float4;
}

struct BoneMatricesConstants
{
    BoneMatrices: [75]float4x3;
}

struct FVertexInput
{
    Position:      float4 @attribute(0);
    TangentX:      half3  @attribute(1);
    TangentZ:      half4  @attribute(2);
    BlendIndices:  int4   @attribute(3);
    BlendWeights:  float4 @attribute(4);
    DeltaPosition: float3 @attribute(5);
    DeltaTangentZ: half3  @attribute(6);
}

struct FVertexOutput
{
    ScreenPosition:    float4;
    CameraVector_FogA: float4;
    Position:          float4 @position;
}

func MorphPosition(Input: FVertexInput) -> float3
{
    return Input.Position.xyz + Input.DeltaPosition;
}

func MorphTangents(Input: FVertexInput) -> float3x3
{
    var Result: float3x3 = 0;
    // Basic type inference based declaration
    var TangentX = float3((Input.TangentX / 127.5) - 1);
    var TangentZ = float4((Input.TangentZ / 127.5) - 1);
    var DeltaTangentZ = float3((Input.DeltaTangentZ / 127.5) - 1);
    Result[2] = normalize(float3(TangentZ.x, TangentZ.y, TangentZ.z) + DeltaTangentZ);
    Result[0] = normalize(TangentX - (dot(TangentX, Result[2]) * Result[2]));
    Result[1] = normalize(cross(Result[2], Result[0]) * TangentZ.w);
    return Result;
}

func SkinPosition(Input: FVertexInput, BoneBuffer: BoneMatricesConstants) -> float3
{
    var Position = float4(MorphPosition(Input), 1);
    var BoneMatrix = CalcBoneMatrix(Input, BoneBuffer);
    return Position * BoneMatrix;
}

func CalcBoneMatrix(Input: FVertexInput, BoneBuffer: BoneMatricesConstants) -> float4x3
{
    var BoneMatrix = Input.BlendWeights.x * BoneBuffer.BoneMatrices[Input.BlendIndices.x];
    BoneMatrix += Input.BlendWeights.y * BoneBuffer.BoneMatrices[Input.BlendIndices.y];
    BoneMatrix += Input.BlendWeights.z * BoneBuffer.BoneMatrices[Input.BlendIndices.z];
    BoneMatrix += Input.BlendWeights.w * BoneBuffer.BoneMatrices[Input.BlendIndices.w];
    return BoneMatrix;
}

func SkinTangents(Input: FVertexInput, BoneBuffer: BoneMatricesConstants) -> float3x3
{
    var Tangents = MorphTangents(Input);
    var TangentBasis: float3x3;
    var BoneMatrix = CalcBoneMatrix(Input, BoneBuffer);
    TangentBasis[0] = float4(Tangents[0], 0) * BoneMatrix;
    TangentBasis[1] = float4(Tangents[1], 0) * BoneMatrix;
    TangentBasis[2] = float4(Tangents[2], 0) * BoneMatrix;
    return TangentBasis;
}

// I'm putting this attribute at the front compared to fields/parameters because it is easier for the user to read
@vertex
func VertexMain(Input: FVertexInput @inputs, FVertexConstants Constants @buffer(0), BoneBuffer: BoneMatricesConstants @buffer(1)) -> FVertexOutput
{
    var Output: FVertexOutput;
    var WorldPosition = Constants.LocalToWorld * float4(SkinPosition(Input, BoneBuffer), 1);
    Output.ScreenPosition = Output.Position = Constants.ViewProjectionMatrix * WorldPosition;
    var TangentBasis = SkinTangents(Input, BoneBuffer);
    var WorldVector = Constants.CameraPosition.xyz - WorldPosition.xyz * Constants.CameraPosition.w;
    Output.CameraVector_FogA.xyz = TangentBasis * (Constants.WorldToLocal * WorldVector);
    Output.CameraVector_FogA.w = 0;
    return Output;
}

icculus commented 2 years ago

First shot at semantic analysis just landed in revision control; got a little to clean up in there and then I'm going to start incorporating feedback from this thread.

icculus commented 2 years ago

Opinions on constructors, anyone?

The language is fairly strict about not implicitly coercing datatypes to match (assigning an int to a float will fail, you can't do if (1) because it's not a bool, etc). There is a little syntactic sugar to make things less ugly in one specific case, though: int literals will promote to various things:

var float x;
var int y = 5;
x = y;  // compiler error, you can't assign an int to a float!
x = 5;  // this will work, though, but only because it's a literal.
var float4 z = y;   // an int is not a float4.
z = 5;  // the compiler understands this literal is shorthand for float4(5.0,5.0,5.0,5.0)
// the real reason for this:
var float4 a = z / 5;  // this looks nice and clear instead of having to build a float4 to divide.

(float literals will do this too, but only promotes to vector and matrix types, not integers, bools, etc.)

That digression aside, GLSL is also strict in similar ways, but has a constructor syntax for when you need to cast, and it's the wild west: basically you can construct anything from almost anything:

bool x = bool(232);  // becomes true
vec4 f = vec4(1);  // becomes 1,1,1,1
vec4 u = vec4(2,3);
vec4 z = vec4(y.xy, f.z, float(true));

How permissive do we want this to be? Do we want this to be "almost anything can construct anything" as a guarantee you won't get stuck in a situation where you can't reasonably get your data into the necessary format, or do we want to impose rules of some sort?

We can be as strict as "you can't construct an int from a bool, even though it would be obvious for true to become 1 and false to become 0, use the ? operator instead", or we can be as loose as "if you construct a float4 from a user-defined struct, it'll use the fields until it has enough elements to fill in the whole vector and default to zero for the fields that can't be initialized because there weren't enough fields in the struct."

icculus commented 2 years ago

(We can be whatever we want to be, we don't have to say "GLSL/HLSL/MSL does this, so we must match it perfectly." If GLSL's constructors are a footgun, let's do our own thing.)

kg commented 2 years ago

I would start conservative, you can always relax the restrictions in future versions of the language. I think coercing ints or floats to bools is a no-go and I think the vec constructors should accept the correct number of elements, no more or no less (assigning a scalar TO the vector feels like the right way to set all the elements). It's too easy to end up with weird bugs due to GLSL and HLSL's behaviors here.

icculus commented 2 years ago

assigning a scalar TO the vector feels like the right way to set all the elements

I think there's going to be way to much temporary object creation that wants to do...

myfunction(float4(splat));

...without an explicit temporary var and an assignment statement, so I don't think we can avoid it completely.

The other thing I see all over the place, that feels like fingernails on a chalkboard to me despite its popularity, is this...

x = float4(myvar.xyz, 1.0);

...and once you have that, it's pretty much open season.

I'm still fighting with this.

darksylinc commented 2 years ago

assigning a scalar TO the vector feels like the right way to set all the elements

As much as I hate it, NOT doing this has saved me a lot of bugs.

The following should be ok:

float4( float, float, float, float )
float4( float3, float )
float4( float2, float2 )
float4( float2, float, float )

These are in hot water:

float4( float, float3 )
float4( float, float, float2 )

These are not ok:

float4( float )
float4( float2 )
float4( float3 )

It may seem arbitrary, but it just comes from experience. The thing is that this hides a lot of bugs:

result = float4( myValue );

When myValue can be either float or float4 (or worse if float3 & float2 are allowed). Nothing worse than realizing result.xyzw == result.xxxx because myValue is accidentally a scalar.

The following idiom will compile correctly and avoid bugs:

result = float4( myValue.xxxx );

Metal unfortunately doesn't allow this for float (because float is a native datatype while floatN is a template class) but I think that was mistake.

Additionally IMO swizzles on literals should work:

result = float4( 4.0f.xxxx );

Or in any case:

result = float4( float( 4.0f ).xxxx );

Open Question

In my experience, although more verbose, I've learnt to appreciate using explicit swizzles.

For example:

float3 myValue;
r = float4( myValue, anotherVar );

float3 myValue;
r = float4( myValue.xyz, anotherVar );

Functionally they're the same. However it may be worth debating if you want to forbid r = float4( myValue, anotherVar ); while allowing r = float4( myValue.xyz, anotherVar ); since the former hides a potential bug (are myValue & anotherVar both float2? or float3 and float? or viceversa?) while the latter leaves no room for doubt (myValue.xyz is float3, anotherVar must be float).

kg commented 2 years ago

Explicit swizzles for all these cases could be a good compromise. They're not super verbose but they clearly state intent. Hypothetical question: If you were to require val = float4(foo.xxxx) what's wrong with val = foo.xxxx at that point, assuming you prohibit val = foo (i'm fine with this)?

The float4(a.xyz, w) case is common enough that I'm receptive to the need to support it without having things explode out into all sorts of gross permutations. @darksylinc's suggested list of patterns seems good to me - the rule can generalize to '1-N parameters containing a total of N scalars, where the size of each parameter is equal or less than the size of the previous parameter'. That prohibits stuff like float4(1, r.xyz) and such happening by accident (they can always recreate it at 0 execution cost with a swizzle).

icculus commented 2 years ago

Heretical thought: should we require flow control statements to use {}?

So this wouldn’t work:

if (x) do_something();

But this would:

if (x) { do_something(); }

This way there’s no risk of this mistake:

if (x);  // uhoh, the following statement block always runs!
{
   do_something();
}

But this might drive people positively nuts. But if we’re killing sacred C cows to remove C footguns, this would be an easy fix.

icculus commented 2 years ago

But if we’re killing sacred C cows to remove C footguns, this would be an easy fix.

I killed the sacred cow this morning; braces required now.

While I'm issuing decrees here: since we need braces and expressions can't be statements (etc), it's not ambiguous if you don't have parentheses around the condition expression, so they're optional, now...

if x == 1 {   // now legal!
    do_something();
}

C aficionados can wrinkle their nose then ignore this, since an expression can be wrapped in parentheses for the expected effect without any changes to the parser:

if (x == 1) {    // like C requires, it doesn't make any difference to us.
    do_something();
}

Now I'm sitting here deciding if the function signature should allow void:

function int x(void) { return 1; }   // this is currently legal.
function int x() { return 1; }   // this is also legal, but maybe it should be the only thing that is...?

vanka78bg commented 2 years ago

Call me ignorant, but is inventing yet another high-level shader language strictly necessary? Is it possible to just adopt SPIR-V as an intermediate shader format instead? Then the developers can use any language they like, as long as there is a compiler to output SPIR-V for it. I admit, I have not researched if this is even remotely feasible, especially on Apple platforms, but it seems theoretically possible at least.

Creating your own shader language seems like too much of a duplicated effort to me. Additionally, you should have to maintain it forever, adding new features constantly, every time a new video card with a new shader model is introduced.

kg commented 2 years ago

I can't speak for Ryan, but adopting SPIR-V as your official IR and (presumably) GLSL or HLSL as your official language has problems. Here are some reasons I can think of for a custom language:

SDL_gpu users will attempt to use features from GLSL or HLSL and file bugs when they don't work.
SDL_gpu users will have to install and keep up-to-date the whole toolchain (GLSL/HLSL compiler, spirv-cross, etc) and navigate compatibility issues
You need to maintain some amount of adaptor/interop layer for each of your supported targets just to map the SPIRV to that target
SDL_gpu has its own specific feature set, so a custom language can map well to that feature set and the intended way to use it - for example perhaps GLSL or HLSL have multiple ways you can represent uniforms, while SDL_gpu's shader language will make sure you do uniforms and constants right every time.
SDL_gpu's shader language can maintain a lower barrier to entry than other languages, much like how SDL has a low barrier to entry for casual developers.
<possible legal hazards censored so you don't have to think about them>

While I disagree with their decision for other reasons, the WebGPU spec committee also opted to do stuff custom instead of adopt GLSL+SPIR-V for multiple reasons. Sadly it is a decision with precedent so it's not weird for SDL_gpu to do it. I don't personally want to learn another language but if SDL_gpu is going to have a custom one, it should be good :-)

The need to maintain it forever adding new features is overstated - SDL_gpu doesn't have to support every new feature. If you want to use a bleeding edge Vulkan extension, you're probably not using SDL_gpu to begin with.

fwiw I've been using d3d9/ps3_0 for like 12 years now (first via XNA, now via FNA+SDL2+MojoShader) and while it's annoying to have limitations, I've been able to do high spec experimental graphics stuff with that toolchain for a long time. I think people underestimate what the average developer can get done if they have a good toolchain and SDL+SDL_gpu will likely empower a lot of people to focus on being productive.

Arcnor commented 2 years ago

For what it's worth (and just because I want to be part of this thread in case it becomes historical or something :P), adding to @kg 's excellent points above, BGFX already does what @vanka78bg mentions, using GLSL as its base and it's not really a pleasant experience.

Besides a completely lack of documentation about the actual incompatibilities with "raw" GLSL (which is not GLSL's fault of course), it converts GLSL to GLSL (for optimizations I think), ESSL, SPIRV, HLSL and Metal (using HLSL as a base for everything except GLSL/ESSL IIRC), but you always have different gotchas depending on what the target actually is and you need to change your GLSL slightly depending on unknown rules.

Granted, this doesn't mean it cannot be done better, but given the "market share" BGFX already has and the base projects it uses to do all this (I don't think you'd be reinventing the wheel if you go this route, but using existing compilers I'm assuming) I really doubt you'll get a different result unless you write everything from scratch, and that's already a similar (and huge!) amount of work just to shoehorn GLSL into all of this, at which point creating your own flavor with your own limitations seems to be a better solution.

Just my (hopefully slightly informed) 2c.

icculus commented 2 years ago

This may or may not be interesting: https://twitter.com/icculus/status/1557753925469671425

icculus commented 2 years ago

This may or may not be interesting: https://twitter.com/icculus/status/1557753925469671425

1129 votes total, was almost exactly 70/30 at any given hour of the poll in progress.

icculus commented 2 years ago

I ended up letting both forms be valid, so use what you like!

https://github.com/icculus/SDL_shader_tools/blob/e2b3df139ac4118d9894bd7c4b54b703d324bac1/docs/README-shader-language-quickstart.md#variable-declaration-can-be-c-like-or-not

icculus commented 2 years ago

I don't know how widespread switch statements are in shaders in general, but do we want to remove fallthrough cases as a footgun?

This thing:

switch (x) {
    case 1:
         do_something();
         /* there's no `break` statement here */
    case 2:
         do_something_else();
         break;
}

Where if x == 1, this will call both do_something() and do_something_else().

This can be useful, but it can also be a disaster, as everyone knows. Should we require cases to end with a break/return/discard, or require them to be wrapped in braces that imply a break, or maybe just even imply a "break" at the end of a case without braces?

(or just dump the switch statement entirely? Or leave it as-is? There are arguments for every possibility on this one.)

darksylinc commented 2 years ago

I lack enough understanding to offer advise other than

Historically switch statements had been deemed as slow (if ladders, divergence, etc). We're talking > 10 year habits
Later certain usages of switch statements could degrade into high register pressure as a worst case. I can't remember if lack of default statement was good or worse, but it had an impact. If/else are more predictable. However for simple cases there's no difference (perf speaking)

icculus commented 2 years ago

if ladders

I'm certain on some targets, this is absolutely going to degrade to a collection of if-statements.

But like I said, the argument might be not to fix switch syntax, but to remove the switch statement outright (and revisit in version 2 if people really miss it).

Andre-LA commented 2 years ago

About fallthrough:

Both on nelua and odin, you need to use the fallthrough keyword instead of break in order to use fall through.

This way, accidental fall through is avoided.

icculus commented 2 years ago

I've started work on the bytecode format, so if anyone has opinions on that sort of thing, discussion is sitting over in #6.

rudolfwalter commented 2 years ago

I ended up letting both forms be valid, so use what you like!

I strongly dislike the idea of having two different syntaxes for doing the same thing, especially something as common as declaring variables. I believe the inconsistency will be confusing for beginners, and annoying for everyone else.

I would vote to pick one syntax and stick with it, whichever it is.

hw-claudio commented 1 year ago

It seems VERSION 1 is simpler and with less clutter so I'd use that but I don't have a strong opinion there.

In general I wanted to share that the idea of SDLSL is cool, and I was waiting for something like this, an in-SDL way to deal with pixel shaders, to substitute a lot of custom cpu code to do sprite outlines, effects etc in my 2D engine and games.

Whatever you do, I'd just ask to keep it as simple as possible; if one wants access to super complex features and have a million options, they can use vulkan, Direct3D, OpenGL etc, but at least for me (solo dev, working on my own 2D engine and small games in my free time) simple and uncluttered is the main thing. Just like SDL is.

I currently use SDL2 with plain C and with its builtin texture render API, it has the minimum needed and I love it for it.

I also use SDL_image to load pngs, and SDL_mixer for the simple audio part of the engine.

If SDLSL were available I would use it to make the existing stuff more efficient and simple, doing outlines, shading the sprites, some water effects etc.

Thanks and can't wait to use it!