briot / tree-sitter-ada

Ada grammar for tree-sitter
MIT License
21 stars 5 forks source link

expression vs _subtype_indication in discrete_choice #6

Closed brownts closed 1 year ago

brownts commented 1 year ago

Hi @briot:

I'm seeing an issue with parses involving discrete_choice when the value is an identifier, selected_component, character_literal, etc. These are being incorrectly parsed as _subtype_indication when they should be parsed as an expression. The following simple example should demonstrate the problem:

with Ada.Text_IO; use Ada.Text_IO;

procedure Test3 is
   type Color_Type is (Red, Green, Blue, 'W');
   Color : Color_Type := 'W';
begin
   case Color is
      when Red           => Put_Line ("Red");
      when Green .. Blue => Put_Line ("Green or Blue");
      when 'W'           => Put_Line ("White");
   end case;
end Test3;

The interesting part of the parse tree involves the discrete_choice of the case statement. Both the "Red" identifier and "W" character_literal are being parsed as _subtype_indication, whereas "Green" and "Blue" both show correctly as an expression. I believe both "Red" and "W" should be parsed as expressions too. A character_literal isn't even a valid subtype name.

          (case_statement_alternative [7, 6] - [7, 45]
            (discrete_choice_list [7, 11] - [7, 14]
              (discrete_choice [7, 11] - [7, 14]
                subtype_mark: (identifier [7, 11] - [7, 14])))
...
          (case_statement_alternative [8, 6] - [8, 55]
            (discrete_choice_list [8, 11] - [8, 24]
              (discrete_choice [8, 11] - [8, 24]
                (range_g [8, 11] - [8, 24]
                  (term [8, 11] - [8, 16]
                    name: (identifier [8, 11] - [8, 16]))
                  (term [8, 20] - [8, 24]
                    name: (identifier [8, 20] - [8, 24])))))
...
          (case_statement_alternative [9, 6] - [9, 47]
            (discrete_choice_list [9, 11] - [9, 14]
              (discrete_choice [9, 11] - [9, 14]
                subtype_mark: (character_literal [9, 11] - [9, 14])))

I think the probability that a discrete_choice will contain an expression is much higher than it containing a _subtype_indication and likely should be given higher priority. This issue manifests in many places where a discrete_choice is used, not just this one example. Yet another example is an enumeration_representation_clause where all the enumeration identifiers are parsed as _subtype_indication.

parse_tree.txt

briot commented 1 year ago

I have tried to reproduce your issue with the current parser on master, but there is no subtype_indication in there. Are you using the latest version (the recent changes are unrelated, so it has been several months since the output of this test would have changed).

Here is the new test I added:

declare
   type Color_Type is (Red, 'W');
   Color : Color_Type := 'W';
begin
   case Color is
      when Red => null;
      when 'W' => null;
   end case;
end;

--------------------------------------------------------------------------------
(compilation
  (compilation_unit
    (block_statement
      (non_empty_declarative_part
        (full_type_declaration
          (identifier)
          (enumeration_type_definition
            (identifier)
            (character_literal)))
        (object_declaration
          (identifier)
          (identifier)
          (expression
            (term
              (character_literal)))))
      (handled_sequence_of_statements
        (case_statement
          (expression
            (term
              (identifier)))
          (case_statement_alternative
            (discrete_choice_list
              (discrete_choice
                (identifier)))
            (null_statement))
          (case_statement_alternative
            (discrete_choice_list
              (discrete_choice
                (character_literal)))
            (null_statement)))))))
brownts commented 1 year ago

The subtype_indication will not show up in the parse tree because it is hidden, however if this were to have been parsed as an expression, the identifier and character_literal would have been wrapped in an expression (e.g., "(expression (term (identifier)))"), which they are not.

This is even more evident if you add field names to your tests. When you do that, you will see the "subtype_mark" field shows up in the discrete_choice. I use the "subtype_mark" field name in my highlighting rules to identify and properly highlight types in the language. This doesn't work with the discrete_choice as it's incorrectly parsing non-types this way.

In the following, I have modified your expected results to add in the field names (and your test still passes) and you will see the "subtype_mark" field name appears (which comes from the subtype_indication).

(compilation
  (compilation_unit
    (block_statement
      (non_empty_declarative_part
        (full_type_declaration
          (identifier)
          (enumeration_type_definition
            (identifier)
            (character_literal)))
        (object_declaration
          name: (identifier)
          subtype_mark: (identifier)
          (expression
            (term
              name: (character_literal)))))
      (handled_sequence_of_statements
        (case_statement
          (expression
            (term
              name: (identifier)))
          (case_statement_alternative
            (discrete_choice_list
              (discrete_choice
                subtype_mark: (identifier)))
            (null_statement))
          (case_statement_alternative
            (discrete_choice_list
              (discrete_choice
                subtype_mark: (character_literal)))
            (null_statement)))))))
briot commented 1 year ago

Thanks for the work here, sorry I did not have time to look into that this week. I'll find the time to review next week