o3de / o3de-azslc

Amazon Shader Language (AZSL) Compiler

Update AZSLc (and O3DE) To Support HLSL 2021 Features #45

Open santorac opened 2 years ago

santorac commented 2 years ago

O3DE already uses the latest release of DXC: 1.6.2112. This includes preview support for the new HLSL 2021 feature set (see https://devblogs.microsoft.com/directx/announcing-hlsl-2021/) but AZSLc does not support these new features. I haven't tested this, but I suspect that at least some of the new features will fail to be parsed by AZSLc and will need upgrades to be passed through to the HLSL output.

We should update AZSLc's grammar to support these new features, and cut a new release. To demonstrate the new features, we should look for ways to improve existing shader code in O3DE, for example by templatizing some utility functions.

Note that there is a command line option for DXC to enable these features: "-HV 2021". We will need to update O3DE's configuration to pass this command line option when running DXC; see Gems\Atom\Asset\Shader\Config\shader_build_options.json.

siliconvoodoo commented 2 years ago

I'm thinking about this task, and I have to say, I still need to shape it in my head enough to be sure about what can be done, at what cost, and also what should be done (a potentially orthogonal question).

So allow me to put down the bits necessary for brainstorming:

Template functions and data types

This is tricky. The problem has multiple facets:

  1. There is in fact a stronger isolation between AZSL and HLSL than AZSL's look and feel may lead us to believe. AZSLc does re-emission; therefore improvements in the HLSL front end (templates) are not necessarily useful, since we can "flatten" the emitted code to concrete template instantiations on our side. This is, by the way, a good thing if we ever need to target compilers other than DXC.
  2. AZSL also has a spirit of its own. Its inspiration originates from Slang, which derives from Swift, itself a sort of convergence of lessons learned from C++/D/Haskell and lambda calculus theory (System F, SOL), which we also find in modern languages like Nim or TypeScript. I don't know that we must absolutely revere the dogma of saint HLSL just because. At the least, a proper community discussion is in order. The choice is between adopting generics with existential semantics (somewhat like C++20's concepts) and adopting templates.
  3. The difficulty of implementing the template path. Templates are no walk in the park: they are Turing-complete, a functional meta-language in their own right. Supporting them implies horrendous test suites and possibly a lot of development time in the semantic orchestrator class.

I suppose point 3 can cause some head scratching, especially "why would AZSLc need to be fully semantically aware of the template program?". I suggest this is a point that needs more brain juice, and possibly some data gathering too. Let's imagine the outcomes:

  1. I can prove it, and thus we accept that it is too expensive to support. One solution would be to restrict templates enough that the little feature left accessible is still interesting, AND cheap to implement.
  2. I can't prove it, and we find a hack that bypasses full semantic understanding and still gets a lot of juice for cheap.
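As a sketch of what a restricted subset (outcome 1) might look like, consider the hypothetical rule below; the shape of the restriction is invented here for illustration, not an agreed design:

```hlsl
// Hypothetical restricted subset: a single type parameter, no specialization,
// no non-type parameters, and no SRG references in the body. AZSLc could
// forward such a function to DXC unchanged after a shallow lexical check.
template<typename T>
T Remap(T value, T fromMin, T fromMax, T toMin, T toMax)
{
    return toMin + (value - fromMin) / (fromMax - fromMin) * (toMax - toMin);
}
```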

Overloadable operators for user defined data types

That is probably not a can of worms; the problem is preserving the quality of type tracking, but I think it's doable.

Bitfield members for data types

I'm wary of all the packing/padding/alignment rules that we're going to have to update because of this.

Strict casting rules for user defined data types

That is actually already the case in AZSL: AZSL never tolerated type equivalence through layout. So on the contrary, this will give us more robustness for free, since the DXC input will no longer be ambiguous in the places where we re-emit naively.
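A sketch of the HLSL 2021 rule in question (illustrative only, assuming the behavior described in the announcement): two user-defined types with identical layouts are no longer implicitly interconvertible, which matches what AZSL always enforced.

```hlsl
struct PointA { float3 position; };
struct PointB { float3 position; };

void Example(PointA a)
{
    // PointB b = a;      // rejected under -HV 2021: no layout-based equivalence
    PointB b = (PointB)a; // an explicit cast is still expected to compile
}
```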

Logical operator short circuiting for scalar types

Seems like an LLVM thing. No impact for us?
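For reference, the visible part of this change (as described in the HLSL 2021 announcement; the example itself is a sketch) is that `&&`, `||`, and `?:` short-circuit on scalars, while element-wise selection on vectors moves to the new `select` intrinsic:

```hlsl
float3 PickColor(float3 v, float3 a, float3 b)
{
    bool3 mask = v > 0.0;
    // Pre-2021 style: float3 r = mask ? a : b;  // vector ?: is no longer allowed
    return select(mask, a, b); // HLSL 2021 replacement for vector selection
}
```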

C++ for loop scoping rules

I'm appalled that this wasn't the case before. I'll need to look in detail at what really changed. EDIT: did look:

int I = 2;
for (int I = 0; I < 10; ++I) {
  // make pretty colors
}
// I == 10!!!

This is properly insane. AZSL never behaved like that, so this is again a fix we benefit from, one that makes the output more robustly understood by DXC.
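Under `-HV 2021` the same snippet should behave the C++ way (a sketch of the expected behavior, not tested here):

```hlsl
int I = 2;
for (int I = 0; I < 10; ++I) {
  // the loop-local I shadows the outer one
}
// here, I == 2: the outer variable is untouched under -HV 2021
```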

santorac commented 2 years ago

@siliconvoodoo thanks for your analysis. I believe the original motivation for this task was to incorporate support for template functions. The (incorrect) assumption was that doing so would be relatively easy because we could rely on DXC to do the heavy lifting. In addition to what you explained, I imagine AZSLc would have to do template expansion in order to properly transpile SRG references, and possibly do other dependency tracking, so there would be little benefit or even opportunity to rely on DXC's template expansion. Perhaps @jeremyong-az can fill in other background regarding the motivation behind this task.

In my view, this task could be undertaken if it is low-hanging fruit (which apparently it is not), otherwise there are higher priorities.

jeremyong-az commented 2 years ago

@siliconvoodoo Thanks for the comment; replies follow:

I suppose point 3 can cause some head scratching, especially "why would AZSLc need to be fully semantically aware of the template program?".

I would argue that it shouldn't, and that it simply needs to be lexically aware enough that AZSLc can extract binding/reflection data. Realistically, HLSL/DXC continue to make progress, and I think it's not sustainable to mirror their efforts, or to attempt to provide our own validation/warning/error layer on top. The value AZSLc provides is an RHI-agnostic, consistent scheme to map shader resources and binding schemes across platforms in a user-friendly way. As upstream syntax changes roll out, having AZSLc operate as a thinner layer on top of DXC is desirable in my mind, so that it works in a complementary rather than competing fashion. I would argue that trying to exhaustively expand the test suite with comprehensive coverage of template handling is counterproductive. If we could simply pass over template expressions to extract reflection information, replace variables as needed, and hand the rest of the shader to DXC, that would be ideal.

I don't know if we must absolutely revere the dogma of saint HLSL just because. At least, a proper community discussion should be in order. The choice is to adopt generics with existential semantics (sorta like C++20's concepts), versus templates.

We should certainly discuss with the community, but in my mind this is a moot point until there is sufficient appetite to actually take on this work. Templates, while not always a perfect solution by any means, solve certain problems immediately. For example:

// Fresnel factor
// Specular reflectivity increases at grazing incident angles
// The f0 parameter represents specular reflectance at normal incidence and is
// achromatic for dielectrics but chromatic for metals.
// The f90 parameter represents the reflectance at grazing angles.
template <typename T, int N>
vector<T, N> f_schlick(float u, vector<T, N> f0, T f90)
{
    vector<T, N> f90_t = f90;
    return f0 + (f90_t - f0) * pow(1.0 - u, 5.0);
}

Here's a simple function from my home renderer, which I upgraded to a recent DXC early this year. This function works for an RGB vector, a greyscale component, for a single-precision float, a half type, etc., resulting in an immediate reduction in code. Similar examples can be found all over the place, but the most common theme will be reusing code to accommodate different vector/matrix sizes and scalar types.
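To illustrate how the one definition covers several instantiations (the call sites below are sketched for this discussion, not taken from the renderer):

```hlsl
void Demo(float NdotV)
{
    // T = float, N = 3 (deduced): RGB specular color of a common dielectric
    float3 frRGB  = f_schlick(NdotV, float3(0.04, 0.04, 0.04), 1.0);

    // T = half, N = 3: half-precision variant of the same code
    half3  frHalf = f_schlick(NdotV, half3(0.04, 0.04, 0.04), half(1.0));
}
```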

I'm afraid for all the packing/padding/alignment rules that we're going to have to update because of that.

I don't know that we'd need to do anything except track the total size of the bitfield entry and the offset. The packing/padding/alignment rules should be the same (legacy rules for cbuffer, tight packing for everything else). As an initial pass, I would suggest that we shouldn't need to reflect the bitfield members themselves since they have uses outside of shader bindings.
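A sketch of the kind of declaration this covers, annotated with the minimal tracking described above (field names and widths are invented for illustration):

```hlsl
struct PackedLightData
{
    uint  m_type  : 8;   // bits  0..7  --+
    uint  m_flags : 8;   // bits  8..15   | one 32-bit backing uint; AZSLc would
    uint  m_index : 16;  // bits 16..31 --+ only track its total size and offset
    float m_intensity;   // next 4-byte slot; packing rules unchanged
};
```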

siliconvoodoo commented 2 years ago

I would argue that it shouldn't, and that it simply needs to be lexically aware enough that AZSLc can extract binding/reflection data.

It shouldn't, but we couldn't do symbol tracking without serious semantic comprehension. I too would have preferred that "it shouldn't", but respecting the HLSL/C++ symbol lookup rules, so as to figure out references to SRG resources in the code, turned out to require a full-blown symbol tracking system. SRG references integrate into the source code with partial qualification, full qualification, or no qualification (and are thus relative to context); local variables may shadow them; and they can appear through an SRG typealias, a member object, or a function return. All of the above created an obligation to resolve types and function overloads, and to build a reasonably namespace-aware symbol lookup system (without ADL).

To illustrate a case in just that small example:

ShaderResourceGroup SRG1 : Slot1
{
    int m;

    int F()
    {
        int m;
        return m;
    }
}

The understanding of what "m" is inside F() is crucial to differentiate between this correct emission:

cbuffer SRG1
{
    int srg1__m;
}

int srg1__F()
{
    int m;
    return m;
}

and this naïve but incorrect one:

int srg1__F()
{
    int m;
    return srg1__m;
}

Also, as you can see by referring to the test tests\Emission\name-collision-avoidance.azsl, understanding of symbol references allows for robust re-emission, free of conflicts with potentially similarly named global statics, or with UDTs defined globally versus inside an SRG.

Below is one of the mid-development historical captures, showing the kind of discernment I figured was required of AZSLc so that it looks like a language, and not like a regexp that can fail in hundreds of ways for lack of context understanding: [image]

Realistically, HLSL/DXC continue to make progress, and I think it's not sustainable to mirror their efforts

That is true and I agree, but until now we were cornered by the quirks of the HLSL/C++ lookup rules, and the implementation that emerged ended up heavy, because pulling on one thread brought in a whole universe of semantics to solve. It could possibly be reverted to the original regexp prototype, at the cost of changing the requirements on symbol references. Perhaps we can change what we ask of shader authors: "in azsl 2 it is no longer possible to refer to your SRG resources using natural symbol lookup; you are required to prefix them with the @ symbol and fully qualify them in all situations." Then the re-emission could use a namespace for all the types and functions inside an SRG instead of emitting code flattened to global scope, and maybe avoid all need for reference tracking altogether. But that needs a bit of brain juice first, just to identify the pitfalls ahead. Any language work always requires this effort of imagination, to figure out the contradictions, conflicts, or dead ends we risk running into.
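A hypothetical sketch of that stricter "azsl 2" convention (nothing here exists today; the @ syntax and the SRG/member names are invented purely for illustration):

```hlsl
// Hypothetical AZSL 2: SRG resources must be referenced fully qualified
// and prefixed with '@', so no symbol lookup or shadowing analysis is needed.
float4 ApplyTint(float4 color)
{
    return color * @MaterialSrg::m_tintColor; // unambiguous by construction
}
```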

This was the prior goal for the symbol flattening system, as captured from Slang behavior: [image] (A job currently done by SymbolAggregator::ReorderBySymbolDependency, which, to function, needs robust symbol reference tracking, which in turn needs type understanding, which in turn needs function overload resolution, plus a sort of spanning-tree graph algorithm over class hierarchies to resolve methods hidden by overrides (exposed in the --visitsym CLI option), etc.)

EDIT: I found the document image of the homonym system (it's used by the mutating re-emission system (RegisterLandingScope) to rename SRG functions that belong to the same family): [image]

Templates, ..., solve certain problems immediately.

Certainly. Adopting them requires thinking about the above suggestion, or actually implementing a subset of templates.