xdslproject / xdsl

A Python Compiler Design Toolkit
Other
223 stars 61 forks source link

xDSL can not parse a DenseAttr with Hex values #1971

Closed JosseVanDelm closed 5 months ago

JosseVanDelm commented 6 months ago

Hi!

I'm trying to compile some of the MLPerf Tiny benchmarks. During this, at some point in compilation, the following IR is generated for the weight values, which tend to be quite a bit larger in shape and size than the following minimal error-throwing example here:

%0 = "arith.constant"() <{value = dense<"0x82F5AB00"> : tensor<1xi32>}> : () -> tensor<1xi32>

mlir-opt is able to parse this just fine:

module {
  %cst = arith.constant dense<11269506> : tensor<1xi32>
}

But xdsl-opt is a bit confused :cry:

``` Traceback (most recent call last): File "/home/josse/phd/xdsl/xdsl/tools/command_line_tool.py", line 569, in parse_chunk return self.available_frontends[file_extension](chunk) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/josse/phd/xdsl/xdsl/tools/command_line_tool.py", line 557, in parse_mlir ).parse_module(not self.args.no_implicit_module) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/josse/phd/xdsl/xdsl/parser/core.py", line 127, in parse_module if (parsed_op := self.parse_optional_operation()) is None: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/josse/phd/xdsl/xdsl/parser/core.py", line 658, in parse_optional_operation return self.parse_operation() ^^^^^^^^^^^^^^^^^^^^^^ File "/home/josse/phd/xdsl/xdsl/parser/core.py", line 695, in parse_operation op = self._parse_generic_operation(op_type) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/josse/phd/xdsl/xdsl/parser/core.py", line 837, in _parse_generic_operation properties = self.parse_optional_properties_dict() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/josse/phd/xdsl/xdsl/parser/core.py", line 777, in parse_optional_properties_dict entries = self.parse_comma_separated_list( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/josse/phd/xdsl/xdsl/parser/base_parser.py", line 215, in parse_comma_separated_list elems = [parse()] ^^^^^^^ File "/home/josse/phd/xdsl/xdsl/parser/attribute_parser.py", line 165, in _parse_attribute_entry return name, self.parse_attribute() ^^^^^^^^^^^^^^^^^^^^^^ File "/home/josse/phd/xdsl/xdsl/parser/attribute_parser.py", line 143, in parse_attribute return self.expect(self.parse_optional_attribute, "attribute expected") ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/josse/phd/xdsl/xdsl/parser/base_parser.py", line 179, in expect res = try_parse() ^^^^^^^^^^^ File "/home/josse/phd/xdsl/xdsl/parser/attribute_parser.py", line 128, in parse_optional_attribute return self._parse_optional_builtin_attr() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/josse/phd/xdsl/xdsl/parser/attribute_parser.py", line 581, in _parse_optional_builtin_attr if (val := attr_parser()) is not None: ^^^^^^^^^^^^^ File "/home/josse/phd/xdsl/xdsl/parser/attribute_parser.py", line 644, in _parse_optional_builtin_parametrized_attr return parsers[name.text](name) ^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/josse/phd/xdsl/xdsl/parser/attribute_parser.py", line 659, in _parse_builtin_dense_attr values, shape = self._parse_tensor_literal() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/josse/phd/xdsl/xdsl/parser/attribute_parser.py", line 885, in _parse_tensor_literal element = self._parse_tensor_literal_element() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/josse/phd/xdsl/xdsl/parser/attribute_parser.py", line 849, in _parse_tensor_literal_element self.raise_error("Expected either a float, integer, or complex literal") File "/home/josse/phd/xdsl/xdsl/parser/base_parser.py", line 100, in raise_error raise ParseError(at_position, msg) xdsl.utils.exceptions.ParseError: hex_attr.mlir:1:44 %0 = "arith.constant"() <{value = dense<"0x82F5AB00"> : tensor<1xi32>}> : () -> tensor<1xi32> ^^^^^^^^^^^^ Expected either a float, integer, or complex literal The above exception was the direct cause of the following exception: Traceback (most recent call last): File "/home/josse/.local/bin/xdsl-opt", line 8, in sys.exit(main()) ^^^^^^ File "/home/josse/phd/xdsl/xdsl/tools/xdsl_opt.py", line 5, in main xDSLOptMain().run() File "/home/josse/phd/xdsl/xdsl/xdsl_opt_main.py", line 68, in run module = self.parse_chunk(chunk, file_extension) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/josse/phd/xdsl/xdsl/tools/command_line_tool.py", line 574, in parse_chunk raise Exception("Failed to parse:\n" + e.with_context()) from e ```
Exception: Failed to parse:
hex_attr.mlir:1:44
    %0 = "arith.constant"() <{value = dense<"0x82F5AB00"> : tensor<1xi32>}> : () -> tensor<1xi32>
                                            ^^^^^^^^^^^^
                                            Expected either a float, integer, or complex literal

Also, I suspect that this is an issue also because of the size of this DenseAttr. For the smaller cases it seems that mlir-opt is printing the values one by one with [s,]s and ,s , but for larger sizes it is too lazy to do so, and now I can not parse this my file :upside_down_face: (otherwise I could just round-trip and print it as a huge <[1, 2 , 3, 4]>

I'll look into a PR to solve this tomorrow, but any pointers or help on how to solve this are very welcome!

superlopuh commented 6 months ago

Looks like we need to add a new case here:

        # Integer and float case
        if self._current_token.kind == Token.Kind.FLOAT_LIT:
            token = self._consume_token(Token.Kind.FLOAT_LIT)
            value = token.get_float_value()
        elif self._current_token.kind == Token.Kind.INTEGER_LIT:
            token = self._consume_token(Token.Kind.INTEGER_LIT)
            value = token.get_int_value()
        else:
            self.raise_error("Expected either a float, integer, or complex literal")

I'm guessing mlir is happy to process strings in this position

math-fehr commented 6 months ago

Actually, the code that needs to be changed is near:

        shape: list[int] | None
        if self._current_token.text == ">":
            values, shape = [], None
        else:
            values, shape = self._parse_tensor_literal()
        self.parse_punctuation(">", " in dense attribute")

The dense parameters are either a single string containing an hex, or a tensor literal. It is not possible to parse something like ["0xff", "0xef"] as far as I know

JosseVanDelm commented 5 months ago

solved by #1974