Fleshing Ideas - Githubissues

Jack-Clark commented 2 years ago

Here are a collection of ideas for fleshing that we should explore. Each issue can be expanded into a separate issue once it is more concrete and being worked on.

Generate paths in a more sophisticated way

Currently, when generating a random path, we make a uniformly random choice about which neighbour of a node to visit. This can make it less likely that we will generate interesting paths, particularly for CFGs that have lots of short paths to terminating blocks. Furthermore, the fleshing tool operates without memory, so we often generate the same path when the tool is run repeatedly. It would be good to avoid repeatedly generating the same path.

One option to try to increase the likelihood of generating interesting paths is to bias the random choice at each node based on some definition of interestingness. One thought I had was to use the height of a node in a DFS of the CFG to bias the random choice towards visiting nodes with larger heights. This way we will try to go deeper into the tree.

I also wonder how expensive it would be to exhaustively explore all paths in the CFG (up to some maximum number of cycles in the path)? How large/complex do the CFGs generated get?

Flesh out a particular path

Given a predetermined path, flesh out that path (perhaps checking it is a valid path first). This could be useful for reduction and debugging.

Configurable flesing

We will add features to the flesher (many ideas below), however to aid reduction and debugging, it would be good to ensure that each feature can be easily enabled/disabled.

Exploit knowledge of CFG

Inject statically known loop upper bounds for loop unrolling
Inject statically known unreachability information for blocks that we know are unreachable. For example, if we know that the path ends at a particular terminating block, then the other terminating blocks can be marked as unreachable.

Memory

Use different types of memory to store the direction and output indices, as well as the actual buffers themselves.

SSA

Use SSA instructions for the index operations.

Multiple threads

Have multiple threads (either in the same workgroup or across workgroups) execute the same fleshed path.
Have multiple threads execute different fleshed paths. Without barrier synchronisation, would this imply the threads must belong to different workgroups?
Use barrier synchronisation within a workgroup to have different threads execute different fleshed paths. My understanding is that the main challenge here is to ensure uniformity of control flow at barriers. An extension is to have threads switch the path they are following at barriers. Relevant papers: here and here.

Jack-Clark commented 2 years ago

An additional idea is to insert function calls to other functions with verified CFGs.

afd commented 2 years ago

Yes, that's a nice idea.

Also, for barriers: we can avoid problems of uniformity by (a) having just one thread (maybe a bit boring but may as well try) and (b) having all threads follow the same path (either by having them use a single directions array, or by giving them teach their own directions array, or by giving them a single directions array but their own starting index into this array (which at runtime we make all the same)).

Jack-Clark commented 1 year ago

Here are some additional ideas to explore if we wish to continue work on fleshing:

Use a single directions array that will contain the directions data for all blocks. This will avoid the issue of exceeding the allowed number of buffers on some platforms, as well as simplifying the generated SPIR-V.
Use storage buffers instead of uniform buffers. Uniform buffers were intended to be read-only, however, up until SPIR-V 1.3 they could be written to.
Add path tracking to blocks regardless of whether they are visited or not. This will allow us to diagnose errors more easily, albeit at the expense of slightly more complex/larger SPIR-V. Perhaps this could be an optional feature that is most useful in debug mode?
Have the direction of unvisited conditional blocks depend on directions data. DXC rejects HLSL that has infinite loops that can be detected statically, so this would make it harder for DXC to reject some code.
Have a configuration file to control fleshing. This would control the fleshing features that are enabled, and could even predetermine the paths taken. This should be designed to facilitate easy reduction of the SPIR-V.
A reducer that operates on the configuration file. It could be either a reducer with custom logic or a line based reducer.

mc-imperial / spirv-control-flow

Fleshing Ideas #30