o3de / o3de-azslc

Amazon Shader Language (AZSL) Compiler
Other
23 stars 14 forks source link

Auto option ranks #84

Closed siliconvoodoo closed 1 year ago

siliconvoodoo commented 1 year ago

This is the branch to introduce static analysis for option ranks.

Here is a little history of the development of that branch.

Idea

The main concept is to start from options's seenat (references throughout the code). From a seenat, we determine its AST node, and find out if it's connected to a subtree. Then we walk that tree recursively to accumulate the cost of its nodes. image

Each time we encounter a function call node, we lookup that function by reconstructing the scope context at that position, and using the Lookup function. We establish the cost of the function, recursively and accumulate and cache function costs to memoize. (this is no typo, it's called memoization).

Problems

binary ops

We quickly encountered a problem of limitations of the type system by testing on StandardPBR, some function's overloads couldn't be decided, because of use of binary expressions:

This exception occurs and is interceptible and thus visible when run under the debugger: image

This is meant to be continuable, but will result in an internal type which can cause error #41 at times. These are example of the contexts where StandardPBR hit such limit: image

or image

or image

So, I implemented the binary expression solver for the type system to enrich the cost result.

Methods

Method tracing is harder than free function calls because the starting scope is harder to figure out. I had to refactor the cost analyzer structure by taking inspiration from what's happening in the seenat system. I eventually found a streamlined implementation that only takes 3 supplementary lines of code by exploiting the TypeofExpr system. It supports a.b() but also a.A::b() or a.::A::b() as well.

Scopes

I got a problem with the scope to IdentifierUID (symbol) map. Until now, the system to find out about the symbol covering a token space, was used by the --bindingdep system. It only used functions to populate that map. The invariant was that functions never overloap, and we never have functions inside functions. Therefore the set of intervals is disjoint. It allowed for fast and simple queries using lower_bound (Infimum more exactly). In our current case, we needed full scope reconstruction including inner scopes. Therefore, the invariant was broken and I had to extend the map with a new class, called IntervalCollection which is much more powerful and can now support queries for the closest scope even in case of overlaps. This is based on the mathematical idea that scopes (intervals) activated by a point are the set intersection of the set_of_intervals_starting_earlier and set_of_intervals_ending_later. I tried to early-optimize that function as much as possible by avoiding temporary allocations, but set is a node based container so allocations are legion. We may recourse to Howard Hinnant stack allocator later. That said, my benchmark found no measurable impact from that new system, this is the comparative run on StandardPBR:

image

we're probably good for now.

Results of new features

Between a run with only the function follower, and the full system as it is now, we have a richer and better tracking of option costs, this is the difference between the early implementation and the last one with method follow and binary expressions:

image

DirectionalLights is radically different notably.

This is it, the rest is in the commit messages and the code itself.

Bests.