Open siliconvoodoo opened 1 year ago
I think this is a step in the right direction, to remove the need for complex automatic name mangling, at the cost of some need for developer awareness. I do have some questions though...
First, I don't quite understand what you are saying about collision avoidance. Under the current automatic system, yes there is some need for collision avoidance, because the user might have their own variable already named SRG_m_uniforms
. But if we remove the automatic mangling and we are expecting the developer to be aware of the implicit names, then I think it would okay to also expect the user to avoid having some other symbol with the same name. If DXC compilation fails, the developer should just go rename their variable.
Second, even though we do expect shader authors to be aware of what's going on with SRG field names, I am a little concerned about discoverability for less familiar users who are just reading shader code. If they find the symbol SRG_m_uniforms
and do a find-in-files for that name, they won't be able to discover the declaration site. I can think of a couple ways to mitigate this, I'll put my suggestions separately in comments below.
Suggestion 1: By convention, the author will manually define accessor functions for all SRGs.
ShaderResourceGroupSemantic slot1
{
FrequencyId = 1;
};
ShaderResourceGroup SRG : slot1
{
struct CB
{
float4 color;
};
ConstantBuffer<CB> m_uniforms;
};
float SRG_GetColor()
{
// Note, automatic code-gen will transform SRG's m_uniforms to this
return SRG_m_uniforms.color;
}
The the rest of the code can just use the accessor function, so if someone reading the code searches for "SRG_GetColor" they will arrive at the SRG definition.
#include "inputs.hlsl"
float4 MainPS( float2 uv :TEXCOORD0) :SV_Target0
{
return SRG_GetColor();
}
Suggestion 2: Don't do any automatic renaming. The developer will write standard resource declarations, without register spaces, and all that AZSLc will do is figure out the registers and alignments. There can be sentinels in comments for marking the start and end of the SRG.
ShaderResourceGroupSemantic slot1
{
FrequencyId = 1;
};
//{{ShaderResourcesGroup : slot1 }}
struct SRG_CB
{
float4 color;
};
ConstantBuffer<SRG_CB> SRG_uniforms;
//{{End ShaderResourcesGroup}}
Suggestion 3: Similar to suggestion 2, but we use a namespace to wrap the SRG and just use an attribute or comment sentinel to mark it as an SRG
ShaderResourceGroupSemantic slot1
{
FrequencyId = 1;
};
[ShaderResourcesGroup(slot1)]
namespace MaterialSrg
{
struct CB
{
float4 color;
};
ConstantBuffer<CB> uniforms;
}
if we remove the automatic mangling and we are expecting the developer to be aware of the implicit names, [..] it's ok to also expect the user to deal with collisions
Agreed. Also I like suggestion 2, I added it to the RFC. And will continue to refine it later.
Edit history:
Report on how it's going.
The prototype (https://github.com/SiliconStudio/o3de-azslc-evo) is following the red development curve of the RFC. The horizontal part (additive coding) being done: the support of namespaces in the front end. Yesterday I managed to get the system to support reopening of namespaces (akin to partial SRG). And the emission to respect the original pattern (multiple namespaces). I still have an assert when using nested namesapces. It takes effort to preserve the original program structure in the current emission system. Because it is based on looping over the "ordered symbol apparition list" and recreate the code. (Let's call that recreation: "reformatting AST expansion").
On a dev curve like the yellow curve we would have a minimalist emission system based on token reproduction, and possibly alteration and insertion (to add implicit struct, and option getters). But the emission would be more of a copy (like we do now for function bodies), than a reformatting AST expansion. Anyway that route would cost a lot more effort, to reproduce all the logic about the --cb-body
option, the srgConstant struct construction, pound-line directives, the extended type operators.
Also any route of this RFC aims at doing without the ref tracker, but the option getters have a system reliant on the reference tracker that is causing option references to mutate to a call to the getter function. That will have to be probably un-abstracted, if we lose the ref tracker (seenat) requiring the programmer to use the getters right from the original source.
Srg constants also undergo the same sort of reference mutation (to mention the implicit generated struct and access its member)
Same for cb-body option since it mutates references that don't use [0] subscript to get the subscript. (and quite cleverly since it has a recursive detector to find its presence. look for FindRuleInAstThatIs0AwayWrtPointerDistance
in the source).
cb-body IIRC is meant for support of provo so it is also a question if that platform can be dropped maybe the whole support in azslc can to.
Example:
ShaderResourceGroupSemantic slot1
{
FrequencyId = 1;
ShaderVariantFallback = 128;
};
option bool reflections = false;
[[SRG("slot1")]]
namespace MaterialSRG
{
struct CB
{
float3 sceneBounds;
};
ConstantBuffer<CB> uniforms;
float4 fogClr;
float4 skyClr;
float3 sunDir;
Texture2DMS<float4, 8> fresnel;
enum Composite { Spec, Diff };
Composite Get() { return Spec; }
CB Get(float m) { return uniforms; }
// crash for now:
// namespace SubNS
// {
// float4 GetIBL(float3 dir){ return saturate(dot(dir,sunDir)) * skyClr; }
// }
};
namespace MaterialSRG // reopening an srg-namespace should cause a natural "partial" behavior
{
TextureCube IBL;
sampler s;
}
float4 PSMain2(float2 uv : TEXCOORD0) : SV_Target0
{
return float4(MaterialSRG::Get(0).sceneBounds, 1) * MaterialSRG::IBL.Sample(MaterialSRG::s, float3(uv,0));
}
results in:
bool GetShaderVariantKey_reflections();
#if defined(reflections_OPTION_DEF)
static const bool reflections = reflections_OPTION_DEF ;
#else
static const bool reflections = GetShaderVariantKey_reflections();
#endif
namespace MaterialSRG
{
struct CB
{
float3 sceneBounds;
};
enum Composite
{
Spec,
Diff,
};
::MaterialSRG::Composite Get()
{
return Spec ;
}
::MaterialSRG::CB Get( float m)
{
return uniforms ;
}
}
namespace MaterialSRG
{
Texture2DMS<float4, 8> fresnel : register(t0, space0);
TextureCube IBL : register(t1, space0);
SamplerState s : register(s0, space0);
struct _srgInfoSymbol__srgConstantsStruct
{
};
ConstantBuffer<::MaterialSRG::_srgInfoSymbol__srgConstantsStruct> _srgInfoSymbol__srgConstantBuffer : register(b0, space0);
ConstantBuffer <::MaterialSRG::CB> uniforms : register(b1, space0);
}
float4 PSMain2( float2 uv : TEXCOORD0) : SV_Target0
{
return float4 ( MaterialSRG :: Get ( 0 ) . sceneBounds, 1 ) * MaterialSRG :: IBL . Sample ( MaterialSRG :: s, float3 ( uv, 0 ) ) ;
}
bool GetShaderVariantKey_reflections()
{
uint shaderKey = (::MaterialSRG::m_SHADER_VARIANT_KEY_NAME_[0].x >> 0) & 1;
return (bool) shaderKey;
}
I got the SRGInfo to register at the last apparition of an SRG namespace and emit at its end. This way it can depend on types defined above, like struct CB
. Unfortunately it aggregates view and CB types and emit them all late. So the two Get() functions above now can't compile since they depend on symbol declared after. This is not solvable, unless we break the srginfo emission into a each-srg-member emission independently, and we run a pass of the symbol-dependency-reorder system to attempt to solve accesses to srg-constants which get migrated to a struct anyway. (that system is deactivated now as its on the diet list). A similar problem also exist in version 1 though, it's covered by the document about partial
: https://www.o3de.org/docs/atom-guide/dev-guide/shaders/azsl/#partial-shaderresourcegroup-definitions
On the result example above we can also see the _srgInfoSymbol__srgConstantsStruct
struct is empty, because it gets constructed during a call to SetupScopeMigrations
(link to code) which is also on the diet list. I'll see if I can hack a version of that function that doesn't rely on scope migrations, but just setups custom behaviors.
I believe that in the light of all that investigation data, the cost of making the evolution project is in the order of magnitude of running after and integrating N=2 to 3 HLSL language evolution features in azslc1.x At least it seems that making templates work in azslc1.x will be cheaper than this RFC direction. How many N are we investing for might be the question here.
That said, even in the case of a conservation of the azslc1.x lineup, I am seduced by some elements of this RFC.
For example, I think we should pursue the C.Santora's syntax (concept idea 2) with the namespaces, and implement a unification of the emission method of option, rootconstant & srg-constants (generalize the option
way).
Advantages:
🆒 render azsl hlsl-compatible (attributes excepted)
🆒 diminish radically the dependency on flawlessness of the seenat system. (waves potential problems of the lack of 2 pass scanning as described in this documentation addition)
💯 will give opportunity to get rid of the partial
declaration/usage dilemma limitation
🆓 Remove the need for symbol scope migration
🆓 Very probably frees us from the topological solver ReorderBySymbolDependency
🆒 Improve touch&feel by DXC reporting errors related to symbols that will have the same name than in the original azsl source name
🆒 namespace
support in the front-end can be retroported to azslc1.x from the work in the prototype
That would bring the azslc1.x lineup that much closer to what's necessary for allowing free language evolution, in a more iterative fashion than the sudden fork necessitated by red/yellow/green paths to azslc2. Though without attaining it yet. As such, this could be a softer plan, than all-at-once, to render to goal and readvise after the first step: Update to v2 could use the normal branch & PR process. But I suggest the jump to "evo" being too massive, the v2 be conserved as legacy, and the new evo be continued on a forked repo.
I don't fully understand what you mean about the "lineage timeline", but overall it sounds like you are continuing to move in a good direction. Looking forward to discussing more during our call next week.
Something else to remember, not sure how applicable it is: Keep thinking about whether there are additional restrictions we can define that would further simplify the implementation. For example, if having functions defined inside the SRG namespace is problematic, I think it's okay to just say we don't support that. If nested namespaces don't work right, that's not so terrible. We can publish a list of caveats and known issues, if necessary to reduce the complexity of implementation.
Motivation
With the announcement of HLSL 2021 https://devblogs.microsoft.com/directx/announcing-hlsl-2021/ templates, operator overloading and bitfields have been introduced. We observe that it would be a substantial cost and time inertia to follow such impactful language evolutions in AZLSc. The question is: is there a way to remove the logic-heavy part of azslc (semantic analysis) to make it that azsl is transparently hlsl? That way future evolutions of HLSL as a language would naturally become immediately available, ideally simply by a package release of DXC.
Suggestion
One option that seems to me of least effort, would be to cut the edition process in 2 parts, the AZSL part that holds the resources, and the HLSL part that holds the code.
Concept prototype idea 1
If from that input:
We get that output:
Then we can save it to
inputs.hlsl
and extend it with follow up file:We note that the resource variables have changed names because of the mutations undergone in the process of SRG-erasure (transpilation from AZSL to HLSL). So it requires the programmers to take consciousness of the mutation scheme, and consult the input.hlsl to know what they have to work with.
Advantages
Inconvenients
Evolution idea 1.1
The problem is that the mutation can be platform specific, and can be azslc version dependent. Also, it can be unpredictable because of name collision avoidance. e.g.
SRG::m_uniforms
may becomeSRG_m_uniforms
orSRG_m_uniforms1
.Otherwise said, there is no specification guarantee on the rename scheme. To ease that issue, we can imagine an
__asm__
block scheme, with what historically was called "clobber" declarations to make the link between host language and DSL.Example of what it could look like:
Bear with me that the program is nonsensical. But the point is to illustrate what we lose and what we win. We win language involutivity, but we lose perfect integration with the azsl-declared resource. They need to be bridged in some way (somewhat akin to lambda capture), so that the access to the mutated symbols in the HLSL block can bind to their intended symbol.
Advantages
Inconvenients
using
directives)__hlsl__
block and repeat a short declaration.Concept idea 2
Strongly reduce the invasiveness of AZSL specific syntax constructs. Tending instead toward a decorated HLSL. The compiler would still need to exist to do reflection and resource registers assignation in each platform way. Also would still generate
option
androotconstant
variable getters. Would still need to accept non-HLSL blocks such as: static samplers with in-situ states declarations, or SRG frequencies and option fallback key. But the names would be expected to be stable since no flattening or scope mutation will happen. Client-site usage (later in code), will remain naturally compatible with the declaration.As per @santorac proposal:
Using an annotation, AZSLc2 would have to recognize that attribute to register resources instead of the
ShaderResourceGroup
block of today.Advantage
Inconvenient
Prototype
I (@siliconvoodoo) am forking the main repository to try this evolution here: https://github.com/SiliconStudio/o3de-azslc-evo
Findings
I see 3 pathways of implementation to the target:
Further
We can also decide to delete the
ShaderResourceGroupSemantic
syntax and integrate it to attributes as well:We'll note that those attributes are still oddly not compatible with DXC. Even with -HV 2021:
But it's reasonable as long as AZSLc2 swallows those attributes.
Desirable diet features
seenat
refer to https://github.com/o3de/o3de-azslc/wiki/Features#seenats This is a necessity for mathematically infallible symbol rename and migration. (the migration from SRG scopes to global, and some typealias/structs from function scopes to outter scope which was a bonus of azsl)
Maintaining this is the most costly because of its dependency to reliable lookup. Lookup depends on semantic contexts and requires understanding of scopes, type deduction, inheritance, function overloads, and overrides. Introduction of templates is hindered by the weight of updating all these mechanisms.
impacts:
srg-constants references mutations
in the original azsl source:
accesses to
sunDir
get mutated to their actual materialization in a generated constant buffer, as such:Finding the points of mutation requires the seenat system.
One way to do away with that problem is to adopt the
option
strategy which is to declare a static variable that is fetch from a function call. Refer to wiki features paragraph for illustration.rootconstant
mutationsazsl example source:
results in mutated references to:
Suggestion for solution: the variable is mutated at definition site to a self-initializing static. Therefore we could imagine that the name needn't change. That behavior seems like a conservative choice but it doesn't seem necessary, the original symbol name could probably be preserved. That would be consistent with the behavior for
option
s. And lift us from the need of iterating the references.--no-ms, --strip-unused-srg, --cb-body, --bindingdep
Mentioned in later paragraphs.
Packing
Feature to reflect
"constantByteOffset" "constantByteSize" "typeDimensions"
(document). It would be desirable to delete all the alignment computation code that serves as a support to RHI for buffer-as-byte understanding of where (at what offset) variables actually sit in the CB. This code is heavy and costs many tests, has cost long investigations of reverse engineering DXC, and multiplied by the combination of options for vulkan/dx/glsl rules.Dependencies: The pack computer relies on the type system. Because it accesses type class (user defined or fundamental, matrix or vector) and typeinfo (array or matrix dimensions), and sizeof. The type system can't work without symbol lookup system because they can be combined (UDT members, inheritance...), and typedefed.
Alternative: rely on a sort of reflective DXC API?
Constant folding
It would save us a small amount of code, but always a nice to add to the diet. Unfortunately at this point, it's important for array dimension reflection of SRG resources. Also for [[pad_to(N)]] feature, or option range, the thread count reflector (for metal), static sampler reflection.
Difficult features
--no-ms
Any route will at least diminish the robustness of the current approach: no more possible to check that X in X.Load() is a Texture2DMS type-referring symbol since it relies on Lookup facility, and typeof facility.
Yellow route: Totally unsupported. We won't have enough grammar power to work on AST level anymore.
Alternatives:
--cb-body
This behavior switch on the CLI activates a mode where generation of constant buffer takes a different form, as such: input:
output:
We note that again, in
MainPS
the references to external resources are mutated to a different symbol.SRG::color
becomes::SRG_Color
and more complicated, access touniforms
has be enriched with a subscript access.Solution: It seems like we could once again go with the
option
strategy, declare a static accessor variable with an initializer calling a generated getter function fetching the member in the generated ConstantBuffer corresponding member. As a matter of fact, since these srg-constant variables becomes immediately visible to the outter scope, they may even be declared as is, with the same original name. It will cause declaration vs access order problems though since their declaration site will be migrated all together into one location. We already have this problem for srg-constants. The static variable with initializer seem like a simple enough counter strategy for that.The [0] subscript can maybe be solved in the exact same way, let the references refer to a generated static variable, and initialize it with a fetcher function, the body of that fetcher function will possess the [0] subscript. This way we free the reference sites accross the program from need of awareness of this specificity.
--bindingdep
This system reflects the "participantsConstants" (in JSON) by "dependentFunctions" (entry points). (documentation in features page). It relies on the reference tracker to iterate on appearances through the program of external resources. Same core system than
--strip-unsued-srgs
as described in this picture. The code for the feature can be found here.We can either forgo of that facility, if we re-evaluate its necessity in sensitive platforms like vulkan or metal. Or we will need to find an alternative, using DXC internal API, maybe by analyzing remaining resources post optimization. Though I seem to recall optimization was not a factor, even on the contrary we needed to know the variables that should be there by contract, irrespective of potential dead-variable optimizations. Maybe leveraging clang (or the clang in DXC) if push comes to shove.
--strip-unused-srgs
Reference doc: https://github.com/o3de/o3de-azslc/wiki/Features#strip-unused-srgs This relies on the homonym visitor system to visit the seenats of each resource inside an SRG. It's the same problem as --bindingdep since it relies on the same system.
Any route: will rid us of ability to iterate over seenats since the point of the evolution RFC, is to remove the complexity involved in the seenat system.
Alternatives: Drop that feature? It seems undesirable in raytracing contexts. Maybe we can hack an artificial resource-toucher to force DXC to not optimize out resources (or propose a flag)? Or if we do the clang explorer for --bindingdep it will be factorized for that feature too.