llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.24k stars 12.07k forks source link

[mlir][spirv] Modernize op assembly formats and syntax docs #73359

Open kuhar opened 12 months ago

kuhar commented 12 months ago

The SPIR-V dialect predates the assemblyFormat definitions in TableGen and some of the SPRI-V ops needlessly use handwritten parsers. We should go over the ops in the dialect and update to assemblyFormat wherever possible.

With assemblyFormat in place, we can also drop the handwritten syntax docs, since the latter is redundant in the presence of assemblyFormat. This way the two can't possibly diverge and become confusing. This is similar to the cleanup done in https://github.com/llvm/llvm-project/pull/73343.

llvmbot commented 12 months ago

@llvm/issue-subscribers-mlir-spirv

Author: Jakub Kuderski (kuhar)

The SPIR-V dialect predates the `assemblyFormat` definitions in TableGen and some of the SPRI-V ops needlessly use handwritten parsers. We should go over the ops in the dialect and update to `assemblyFormat` wherever possible. With `assemblyFormat` in place, we can also drop the handwritten syntax docs, since the latter is redundant in the presence of `assemblyFormat`. This way the two can't possibly diverge and become confusing. This is similar to the cleanup done in https://github.com/llvm/llvm-project/pull/73343.
llvmbot commented 12 months ago

Hi!

This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:

1) Assign the issue to you. 2) Fix the issue locally. 3) Run the test suite locally. 3.1) Remember that the subdirectories under test/ create fine-grained testing targets, so you can e.g. use make check-clang-ast to only run Clang's AST tests. 4) Create a git commit 5) Run git clang-format HEAD~1 to format your changes. 6) Submit the patch to Phabricator. 6.1) Detailed instructions can be found here

For more instructions on how to submit a patch to LLVM, see our documentation.

If you have any further questions about this issue, don't hesitate to ask via a comment on this Github issue.

llvmbot commented 12 months ago

@llvm/issue-subscribers-good-first-issue

Author: Jakub Kuderski (kuhar)

The SPIR-V dialect predates the `assemblyFormat` definitions in TableGen and some of the SPRI-V ops needlessly use handwritten parsers. We should go over the ops in the dialect and update to `assemblyFormat` wherever possible. With `assemblyFormat` in place, we can also drop the handwritten syntax docs, since the latter is redundant in the presence of `assemblyFormat`. This way the two can't possibly diverge and become confusing. This is similar to the cleanup done in https://github.com/llvm/llvm-project/pull/73343.
alexbeloi commented 11 months ago

Hi, I'd like to work on this as my first issue

kuhar commented 11 months ago

Sure @alexbeloi. Feel free to ping me if you need some pointers. Also happy to have a chat on discord or a call.

alexbeloi commented 11 months ago

I don't see any duplicate Syntax: in the SPIRV dialect documentation, but there are a bunch in Math/IR/MathOps.td like atan and others, so I will fix those.

kuhar commented 11 months ago

I don't see any duplicate Syntax: in the SPIRV dialect documentation, but there are a bunch in Math/IR/MathOps.td like atan and others, so I will fix those.

@alexbeloi These duplicate ones have already been removed from the SPIR-V dialect: https://github.com/llvm/llvm-project/pull/73386. The remaining part of the task is to use let assemblyFormat = tablegen syntax definition where we currently rely on hand-written parsers.

You can find these by running something like grep -rn 'Op::print(' mlir/lib/Dialect/SPIRV/IR. Some of these are complicated enough to justify the hand-written parsers and printers, but we also have a bunch of old ops that have simple syntax and should be upgraded to assemblyFormat (even if it requires minor syntax changes like using the <attribute> syntax instead of "attribute").

alexbeloi commented 11 months ago

Hey Jakub, I'm updating some of the SPIRV parsing to use assemblyFormat and have a couple questions.

It seems like the change may require more than just a small "attribute" -> <attribute> kind of change. I wanted to check if this would be ok or too much or I should just ignore these cases

It seems that assemblyFormat has some general requirements (AFAIU) to generate the builders

  1. attr-dict is required
  2. operands, results (either individually or with the builtin operands and results keywords) and (sometimes?) their types are required

For example syntax:

%0 = spirv.AtomicAnd "Device" "None" %ptr, %value : !spirv.ptr<i32, StorageBuffer>

Just replacing the quotes with angle brackets doesn't work, it seems to require something like

let assemblyFormat = [{
    `<` $memory_scope `>` `<` $semantics `>` operands attr-dict `:`
    functional-type(operands, results)
}];

This would be added here, to replace the print/parse here.

A bunch of handwritten parsers don't use $result or its type for printing, but the compiler/table-gen complains if it's not included in the assemblyFormat with this error

error: type of result #0, named 'result', is not buildable and a buildable type cannot be inferred
kuhar commented 11 months ago

@alexbeloi

It seems like the change may require more than just a small "attribute" -> <attribute> kind of change. I wanted to check if this would be ok or too much or I should just ignore these cases

There are a few options here. In general, we should be fine with updating op syntax as long as the new syntax is not too different and we converge towards uniformity within the dialect.

It seems that assemblyFormat has some general requirements (AFAIU) to generate the builders

  1. attr-dict is required
  2. operands, results (either individually or with the builtin operands and results keywords) and (sometimes?) their types are required

The result type can often be inferred from ODS type constraints, e.g., AllTypesMatch<[...]>, SameOperandsAndResultTypes -- you can grep for these in mlir/include/mlir/Dialect/SPIRV/IR.

For example syntax:

%0 = spirv.AtomicAnd "Device" "None" %ptr, %value : !spirv.ptr<i32, StorageBuffer>

Just replacing the quotes with angle brackets doesn't work, it seems to require something like

let assemblyFormat = [{
    `<` $memory_scope `>` `<` $semantics `>` operands attr-dict `:`
    functional-type(operands, results)
}];

We have some examples of ops with attributes that use assemblyFormat, one recent one that comes to mind is:

class SPIRV_IntegerDotProductBinaryOp<string mnemonic,
                                      list<Trait> traits = []> :
      SPIRV_IntegerDotProductOp<mnemonic,
        !listconcat(traits, [AllTypesMatch<["vector1", "vector2"]>])> {
  let arguments = (ins
    SPIRV_ScalarOrVectorOf<SPIRV_Integer>:$vector1,
    SPIRV_ScalarOrVectorOf<SPIRV_Integer>:$vector2,
    OptionalAttr<SPIRV_PackedVectorFormatAttr>:$format
  );

  let assemblyFormat = [{
    $vector1 `,` $vector2 ( `,` $format^ )? attr-dict `:`
      type($vector1) `->` type($result)
  }];
}

Here the format is optional and uses the <attr-kind> syntax. There's also a way to tell tablegen that your operand uses custom syntax, you can grep other dialects for custom<.+> and hasCustomAssemblyFormat, but I'd rather avoid that if we can.

Also, I'm personally not a fan of the parentheses introduced by functional-type(operands, results), I find that it's more readable to just do:

type($my-operand) `->` type($my-result)

I'd argue that in the specific case of spirv.AtomicAnd it'd be better to spell out the type of the value operand (same as the return type):

%0 = spirv.AtomicAnd <Device> <None> %ptr, %value : !spirv.ptr<i32, StorageBuffer> -> i32
kuhar commented 11 months ago

gpu.subgroup_reduce is an example of an op that puts its attribute first and uses assembly format: https://github.com/llvm/llvm-project/blob/b26c0ed93a1b735396f3b167ea47d82357468c96/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td#L1026-L1056

hahacyd commented 3 weeks ago

I'm currently looking for a good first issue to work on and noticed this one. It seems like the issue has already been resolved. If everything is indeed addressed, could you please consider closing the issue? This will help newcomers like me find open issues to contribute to.

alexbeloi commented 3 weeks ago

@hahacyd I think there are still a lot of places that need updating. You can take this issue over if you'd like.

hahacyd commented 3 weeks ago

@alexbeloi Thank you for your response! I apologize for mistakenly thinking the issue was resolved because I noticed it was unassigned. I still need to gain a deeper understanding of this issue, so I won't be taking it over for now. However, as I learn more, I'll be sure to offer help where I can. Thanks again for your guidance!

kuhar commented 3 weeks ago

@hahacyd yes, this is far from being done. Feel free to ask questions if you are not sure about something along the way.

hahacyd commented 2 weeks ago

Hi, I found that most operations in SPIRVGroupOps.td both have assemblyFormat and hand-writen parser&printer(in the corresponding cxx files: GroupOps.cpp), I think as for such operations, I can just remove the redundant hand-writen parser & printer (on the basis of that they both have same output). ps: I'd be happy to chat on discord.

kuhar commented 2 weeks ago

SGTM, @hahacyd. Also feel free to ping me on discord if you have questions.