pema99 / UnityShaderParser

BSD 3-Clause "New" or "Revised" License

Using this Library to parse and analyze shaders, how does one properly use this tool? #3

Closed HB-Stratos closed 2 months ago

HB-Stratos commented 2 months ago

Hi there! First of all, let me note that I've never dealt with a parser like this before, so I may miss things that may be obvious.

So what I've been trying to do is this: read a compute shader, find a struct with a specific name, and determine the compute buffer stride I will need to fit this struct. As I'm building a manager for potentially artist-written compute shaders, I can't rely on manually cloning the struct on the C# side to do reflection on. I also can't use what limited reflection Unity offers in ShaderUtil, because ComputeShaders are not Shaders in Unity.

After many, many hours of trying, I've kind of managed to hack together something that does most of what I want, but I can't help but notice that it is quite clunky and likely not how this library was intended to be used. I have read the tests and they have helped. Sadly, they don't appear to dive deep into trees and convert nodes, or if they do, I haven't found it myself. That is why I have come here to ask how I would accomplish my goal in line with how the library was meant to be used.

At the risk of posting some very messy code, here's how I'm currently doing things:

```c#
{
    //TODO make this dynamic
    string shaderFilePath = "Assets/WorkingDir/ParticleSystem2/TestParticle.compute";
    string shaderFileContent = "";
    if (File.Exists(shaderFilePath))
        shaderFileContent = File.ReadAllText(shaderFilePath);
    else
        Debug.LogWarning("File not found at: " + shaderFilePath);

    var hlslParserConfig = new HLSLParserConfig()
    {
        PreProcessorMode = PreProcessorMode.StripDirectives
    };
    var hlslTokens = HLSLLexer.Lex(shaderFileContent, null, null, true, out _);
    var hlslParsed = HLSLParser.ParseTopLevelDeclarations(
        hlslTokens, hlslParserConfig, out _, out _
    );
    var hlslStructs = hlslParsed
        .Where(node => node.GetType() == typeof(StructDefinitionNode))
        .ToList();
    var hlslParticleStruct = hlslStructs
        .Where(node =>
            node.Tokens.First(token =>
                token.Kind == UnityShaderParser.HLSL.TokenKind.IdentifierToken
            ).Identifier == "Particle"
        )
        .ToArray()[0];
    var hlslParticleStructKeywords = hlslParticleStruct
        .Tokens.Where(token =>
            Enum.GetName(typeof(UnityShaderParser.HLSL.TokenKind), token.Kind)
                .Contains("Keyword")
        )
        .Skip(1) // skip the struct keyword itself
        .ToArray();

    int bufferSize = 0;
    foreach (Token token in hlslParticleStructKeywords)
    {
        bufferSize += GetShaderVarSize(token);
    }
    if (bufferSize % 4 != 0)
        Debug.LogWarning("Generated buffer size is not a multiple of 4: " + bufferSize);

    // hlslParsed = HLSLParser.ParseStatements(hlslParsed, hlslParserConfig, out _, out _);
    // var hlslPrinter = new HLSLPrinter();
    // hlslPrinter.Visit(hlslParsed[0]);
    // hlslPrinter.
    // var test = hlslPrinter.Text;
}

int GetShaderVarSize(Token token)
{
    string tokenName = Enum.GetName(typeof(UnityShaderParser.HLSL.TokenKind), token.Kind)
        .Replace("Keyword", "");
    var dimensionRegex = new Regex(@"(?
    // [...] the post is cut off here: the rest of the regex and the lines up to the dictionary are missing
    Dictionary<string, int> sizeInBytes = new Dictionary<string, int>
    {
        { "int", 32 / 8 },
        { "uint", 32 / 8 },
        { "dword", 32 / 8 },
        { "half", 16 / 8 },
        { "float", 32 / 8 },
        { "double", 64 / 8 },
    };
```

You can also find the entire project I am currently working on here: https://github.com/HB-Stratos/KSPShaderDev

Thanks in advance for any help and tips, and for writing this tool in the first place!

P.S.: I've also loaded your library as a local Unity package and git submodule. It kind of works, but it required me to append a ~ to the end of the .Experiments and .Tests directories so they do not get loaded by Unity, as they throw a lot of errors. Unfortunately this means that my local git differs from the one on here, which will likely break if I try to update this repo, or if anyone else tries to clone my project repo.

pema99 commented 2 months ago

Unless you need more granular control, one intended way to use the library is using the entry points found in ShaderParser.cs.

For example, use `var decls = ShaderParser.ParseTopLevelDeclarations(source, config);` instead of manually invoking `HLSLParser` and `HLSLLexer`.

The main gnarly parts of your code are attempts at analyzing the parsed syntax tree. The intended way to walk a syntax tree for analysis is by making a type that inherits from HLSLSyntaxVisitor. It implements the Visitor pattern in a manner similar to Roslyn (google these terms if you don't know what I'm talking about). The API is in general inspired by Roslyn.

Here's how I would write your program (assuming I understand it correctly):

```c#
class StructSizeVisitor : HLSLSyntaxVisitor
{
    // Output
    public int ParticleStructSize = 0;

    // Helpers
    int GetScalarTypeSize(ScalarType scalarType)
    {
        switch (scalarType)
        {
            case ScalarType.Int:
            case ScalarType.Uint:
            case ScalarType.Float:
                return 4;
            case ScalarType.Half:
                return 2;
            case ScalarType.Double:
                return 8;
            default:
                return 0; // Add whichever types you care about
        }
    }

    // Visitor impl
    public override void VisitStructTypeNode(StructTypeNode node)
    {
        // Only care about particle struct
        if (node.Name.GetName() == "Particle" && ParticleStructSize == 0)
        {
            foreach (var field in node.Fields)
            {
                switch (field.Kind)
                {
                    case ScalarTypeNode scalar:
                        ParticleStructSize += GetScalarTypeSize(scalar.Kind);
                        break;
                    case VectorTypeNode vector:
                        ParticleStructSize += vector.Dimension * GetScalarTypeSize(vector.Kind);
                        break;
                    case MatrixTypeNode matrix:
                        ParticleStructSize += matrix.FirstDimension * matrix.SecondDimension * GetScalarTypeSize(matrix.Kind);
                        break;
                    default:
                        break;
                }
            }
        }
        else
        {
            base.VisitStructTypeNode(node);
        }
    }
}

class Program
{
    public static void Main()
    {
        string shaderFileContent = ...;
        shaderFileContent = File.ReadAllText(shaderFilePath);

        var decls = ShaderParser.ParseTopLevelDeclarations(shaderFileContent,
            new HLSLParserConfig() { PreProcessorMode = PreProcessorMode.StripDirectives });

        var visitor = new StructSizeVisitor();
        visitor.VisitMany(decls);

        Console.WriteLine(visitor.ParticleStructSize);
    }
}
```

> I've also loaded your library as a local unity package and git submodule. It kinda works, but it required me to append a ~ to the end of the .Experiments and .Tests directories so they do not get loaded by unity as they throw a lot of errors.

You can use a release instead of embedding the git repo directly. I distribute the library both as multiple files (without all the .Experiments stuff etc.) and as a single file for ease of use. For the Unity use case, you could just download that single script file and use it; it contains the entire library. While it sort of works, the repo isn't intended for direct use as a submodule.

HB-Stratos commented 2 months ago

This looks orders of magnitude more reasonable than what I have been able to create. Thank you so much for your response! I might adjust it a little bit to allow for nested structs and more variable types, but with this implementation I no longer have to do any ugly hacks.

Side question: Does the parser ignore comments entirely, or do they remain accessible in some way? As I want to allow artists to author HLSL files, some minor declarations at the top might be needed. They could be declared as variables, but that feels a bit strange and I would have to hope the compiler strips unused ones.

As for gitmodules, the advantage of using one in this case would be that updating the library would be as trivial as git submodule update --recursive --remote. But having to rename folders breaks that anyway, so unless I can somehow add a release as a submodule (don't think that's how that works), I'll likely have to just manually update the files whenever something is updated.
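On the nested structs and extra variable types mentioned above: the arithmetic itself can be sketched without the parser. The snippet below is only an illustration under stated assumptions. `StrideCalculator`, `SizeOf`, and the dictionary-of-field-type-names model are hypothetical and are not part of UnityShaderParser. It also assumes fields are tightly packed with no padding, and reuses the scalar sizes from the visitor above.

```c#
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of UnityShaderParser.
static class StrideCalculator
{
    // Scalar sizes in bytes (same values as in the visitor example).
    static readonly Dictionary<string, int> ScalarSizes = new Dictionary<string, int>
    {
        { "int", 4 }, { "uint", 4 }, { "float", 4 }, { "half", 2 }, { "double", 8 },
    };

    // Splits "float" / "float3" / "float4x4" into a base name and optional dimensions.
    static readonly Regex TypeRegex =
        new Regex(@"^([a-z]+)(?:([1-4])(?:x([1-4]))?)?$");

    // structs maps a struct name to the type names of its fields, in order.
    // Assumption: tight packing, i.e. the size is the plain sum of the field sizes.
    public static int SizeOf(string typeName, Dictionary<string, string[]> structs)
    {
        // Nested struct: recurse into its fields.
        if (structs.TryGetValue(typeName, out string[] fields))
        {
            int total = 0;
            foreach (string field in fields)
                total += SizeOf(field, structs);
            return total;
        }

        // Scalar, vector or matrix type.
        Match m = TypeRegex.Match(typeName);
        if (!m.Success || !ScalarSizes.TryGetValue(m.Groups[1].Value, out int scalarSize))
            throw new ArgumentException("Unknown type: " + typeName);

        int rows = m.Groups[2].Success ? int.Parse(m.Groups[2].Value) : 1;
        int cols = m.Groups[3].Success ? int.Parse(m.Groups[3].Value) : 1;
        return scalarSize * rows * cols;
    }
}
```

With a real syntax tree, the same recursion would presumably live in the visitor: when a field's type is a user-defined struct, look that struct up and add its size instead of falling through to the default case.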

pema99 commented 2 months ago

> Side question: Does the parser ignore comments entirely, or do they remain accessible in some way? As I want to allow artists to author hlsl files, some minor declarations at the top might be needed. They could be declared as variables, but that feels a bit strange and I would have to hope the compiler strips unused ones.

The parser (well, the lexer, actually) ignores comments entirely, yes. I wanted to add them onto tokens as metadata/trivia at some point, but I never got around to doing it.

If you want annotating comments, you could always just scan the source code for them yourself with a regex or something similar. If you need comments associated with specific pieces of syntax, the tokens in the syntax tree contain info about where in the code they are located, which you could use to match them up with comments.
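A minimal sketch of the "scan it yourself with a regex" route, assuming a made-up convention where an annotation is a line comment of the form `// @key value`. The `CommentAnnotations` class and that comment format are hypothetical and have nothing to do with the library itself.

```c#
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of UnityShaderParser.
static class CommentAnnotations
{
    // Assumed convention: a whole-line comment such as "// @stride auto".
    static readonly Regex AnnotationRegex = new Regex(
        @"^[ \t]*//[ \t]*@(\w+)[ \t]+(.*?)[ \t\r]*$",
        RegexOptions.Multiline);

    // Returns each annotation with the 1-based line it appears on.
    public static List<(int Line, string Key, string Value)> Find(string source)
    {
        var result = new List<(int Line, string Key, string Value)>();
        foreach (Match m in AnnotationRegex.Matches(source))
        {
            // Count the newlines before the match to get its line number.
            int line = 1;
            for (int i = 0; i < m.Index; i++)
                if (source[i] == '\n') line++;

            result.Add((line, m.Groups[1].Value, m.Groups[2].Value));
        }
        return result;
    }
}
```

The line numbers are what would let you pair an annotation with the syntax that follows it, by comparing them against the location info carried by the tokens. Note that this naive scan also matches text that merely looks like an annotation inside a block comment.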

Another option is to use the library's built-in preprocessor, with either custom pragmas or #define. Perhaps something like

```hlsl
#pragma MyCoolAnnotation
struct foo {};
```

or

```hlsl
#define FOO_META MyCoolAnnotation
struct foo {};
```

Alternatively, depending on what you are trying to do, HLSL actually has a little-known built-in syntax for annotating variables, which the parser supports: https://learn.microsoft.com/en-us/windows/win32/direct3d11/d3d11-effect-annotation-syntax

I'm open to contributions but it's unlikely that I'll be adding comment parsing any time in the near future.

> I'll likely have to just manually update the files whenever something is updated.

Yeah. The intention is for users of the library to vendor it. This library isn't Unity-specific, it's for use in any .NET context, so I don't follow Unity's package structure. I might make it more amenable to being used as a submodule later if I can find the time and motivation, we'll see.

HB-Stratos commented 2 months ago

The annotation syntax actually looks really promising. I'll have to play around with it and see how I can access it with the parser, thanks for pointing me that way!

As for submodules and packaging: all that was needed to make it work perfectly fine as a submodule was to prevent Unity from compiling the tests and experiments folders by appending a ~ to their names. As for making it a Unity package, that's trivially easy with two files; you can see how I did it in the git repo I linked in the top post.

pema99 commented 2 months ago

I'll keep it in mind. Maybe in the future.

FYI, I added a few more examples to the README just now.

pema99 commented 2 months ago

Just reopen the issue if you encounter additional problems.

HB-Stratos commented 2 months ago

Awesome, thank you for more examples! One thing I am wondering, I was trying to look at the annotations, but I can not locate a VisitAnnotationNode or similar function. Did I miss something or do I have to access annotations differently?

pema99 commented 2 months ago

Annotations are represented as a field on variable declarator nodes https://github.com/pema99/UnityShaderParser/blob/85122ef88a417c4fae13bfa21656f23433d07774/UnityShaderParser/HLSL/HLSLSyntaxElements.cs#L754

HB-Stratos commented 1 month ago

Hey! I've been using the shader parser with success. However, I've recently pulled some of my code into an include file and changed the preprocessor mode from StripDirectives to ExpandAll, and it seems to handle relative paths strangely. The analyzed shader references `#include "../ComputeShaderLearning/WhiteNoise.cginc"`, which errors out with `DirectoryNotFoundException: Could not find a part of the path "H:\ComputeShaderLearning\WhiteNoise.cginc". System.IO.FileStream..ctor (System.String path, System.IO.FileMode mode, System.IO.FileAccess access, [,,,]`, H being the drive the files are on. It seems to assume all paths are absolute, while Unity at least can read the relative path perfectly well. Is there any way around this?

pema99 commented 1 month ago

You can customize the include resolving logic with this interface. Maybe you can figure it out. https://github.com/pema99/UnityShaderParser/blob/master/UnityShaderParser/HLSL/PreProcessor/IPreProcessorIncludeResolver.cs

The preprocessor calls it here https://github.com/pema99/UnityShaderParser/blob/master/UnityShaderParser/HLSL/PreProcessor/HLSLPreProcessor.cs#L167

I might look into reproing with the example you provided; maybe I don't handle .. correctly.

pema99 commented 1 month ago

Oh, actually, one thing that came to mind: If the shader file you are parsing isn't in the current working directory, you should set this property in the parser config https://github.com/pema99/UnityShaderParser/blob/master/UnityShaderParser/HLSL/HLSLParser.cs#L15

Otherwise the parser has no way of knowing where the file is, since you pass it text directly and not a path, so relative paths will be wrong.
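To make the reasoning concrete, here is a small sketch of how a relative `#include` path resolves once the shader's own directory is known. It uses only `System.IO`. `IncludePaths.Resolve` is a hypothetical helper, not the library's resolver, and the idea that its base directory is what you would hand to the config property linked above is an assumption.

```c#
using System;
using System.IO;

// Hypothetical helper, not part of UnityShaderParser.
static class IncludePaths
{
    // Directory containing the shader file: the natural base for its relative includes.
    public static string BaseDirectory(string shaderFilePath)
    {
        return Path.GetDirectoryName(Path.GetFullPath(shaderFilePath));
    }

    // Resolves an include path relative to the shader's directory.
    // Path.GetFullPath collapses the ".." segments.
    public static string Resolve(string shaderFilePath, string includePath)
    {
        return Path.GetFullPath(Path.Combine(BaseDirectory(shaderFilePath), includePath));
    }
}
```

Without a base directory, `../ComputeShaderLearning/WhiteNoise.cginc` can only be resolved against whatever the process's working directory happens to be, which would explain how the path in the error message ended up so close to the drive root.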

HB-Stratos commented 1 month ago

I think base path will end up being the solution I need; it is currently not set and I'm feeding the parser a string. The current working dir is probably the Unity base path, not the subdir path, so I'll need to set that. I'll try that out on Monday when I'm back home, thank you!