softdevteam / lrpar

Rust LR parser

Multiple %implicit_tokens #30

Closed snim2 closed 7 years ago

snim2 commented 7 years ago

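It seems that when multiple token types are passed to %implicit_tokens, not all of them are honoured: in the reproduction that follows, WHITESPACE is skipped as expected but SINGLE_LINE_COMMENT is not.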
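
To make the expected behaviour concrete, here is a minimal sketch of what "implicit" is assumed to mean: every token whose type is named after %implicit_tokens is hidden from the grammar. The `Token` struct and the `implicit_set` and `significant` functions are hypothetical names made up for illustration. They are not lrpar's API and say nothing about how lrpar implements this internally.

```rust
use std::collections::HashSet;

/// A lexed token: its type name and the text it matched.
/// Hypothetical type for illustration only, not lrpar's.
#[derive(Debug, Clone, PartialEq)]
struct Token {
    kind: &'static str,
    text: &'static str,
}

/// Collect every name listed after `%implicit_tokens` into a set.
fn implicit_set(decl: &str) -> HashSet<&str> {
    decl.split_whitespace()
        .skip_while(|w| *w != "%implicit_tokens")
        .skip(1) // drop the directive itself, keep all names after it
        .collect()
}

/// Keep only the tokens the grammar should see.
fn significant<'a>(tokens: &'a [Token], implicit: &HashSet<&str>) -> Vec<&'a Token> {
    tokens
        .iter()
        .filter(|t| !implicit.contains(t.kind))
        .collect()
}

fn main() {
    let implicit = implicit_set("%implicit_tokens WHITESPACE SINGLE_LINE_COMMENT");

    let tokens = [
        Token { kind: "LBRACE", text: "{" },
        Token { kind: "WHITESPACE", text: "\n    " },
        Token { kind: "SINGLE_LINE_COMMENT", text: "// Single line comment." },
        Token { kind: "WHITESPACE", text: "\n" },
        Token { kind: "RBRACE", text: "}" },
    ];

    let visible = significant(&tokens, &implicit);
    for t in &visible {
        println!("{} {:?}", t.kind, t.text);
    }

    // Both implicit token types are filtered out, not just the first one.
    let kinds: Vec<&str> = visible.iter().map(|t| t.kind).collect();
    assert_eq!(kinds, vec!["LBRACE", "RBRACE"]);
}
```

Under that assumption, the comment in the test file should never reach the parser, so no repair involving "// Single line comment." should be needed.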

Reproducing

Using the files java.l and java.y from here: https://github.com/softdevteam/diffract/tree/master/grammars

If you change the top of the parser to:

%implicit_tokens WHITESPACE SINGLE_LINE_COMMENT

and the last few lines of the lexer to:

'(?:\\'|[^'\n])*' CHARACTER_LITERAL
"(?:\\"|[^"\n])*" STRING_LITERAL
//[^\n]*?$ SINGLE_LINE_COMMENT
[ \t\n\r]+ WHITESPACE

then this file:

public class Comment {
    public final boolean HAS_COMMENTS = true;
    // Single line comment.
}

fails to parse, with this error:

$ ./target/debug/lrpar -y eco ../diffract/grammars/java.l ../diffract/grammars/java.y ../diffract/tests/Comment.java 
^~
 ~
 goal
  compilation_unit
   package_declaration_opt
   import_declarations_opt
   type_declarations_opt
    type_declarations
     type_declaration
      class_declaration
       modifiers_opt
        modifiers
         modifier
          PUBLIC public
          ~
           WHITESPACE  
       CLASS class
       ~
        WHITESPACE  
       IDENTIFIER Comment
       ~
        WHITESPACE  
       type_parameters_opt
       super_opt
       interfaces_opt
       class_body
        LBRACE {
        ~
         WHITESPACE 

        class_body_declarations_opt
         class_body_declarations
          class_body_declarations
           class_body_declaration
            class_member_declaration
             field_declaration
              modifiers_opt
               modifiers
                modifiers
                 modifier
                  PUBLIC public
                  ~
                   WHITESPACE  
                modifier
                 FINAL final
                 ~
                  WHITESPACE  
              type
               primitive_type
                BOOLEAN boolean
                ~
                 WHITESPACE  
              variable_declarators
               variable_declarator
                variable_declarator_id
                 IDENTIFIER HAS_COMMENTS
                 ~
                  WHITESPACE  
                EQ =
                ~
                 WHITESPACE  
                variable_initializer
                 expression
                  assignment_expression
                   conditional_expression
                    conditional_or_expression
                     conditional_and_expression
                      inclusive_or_expression
                       exclusive_or_expression
                        and_expression
                         equality_expression
                          instanceof_expression
                           relational_expression
                            shift_expression
                             additive_expression
                              multiplicative_expression
                               unary_expression
                                unary_expression_not_plus_minus
                                 postfix_expression
                                  primary
                                   primary_no_new_array
                                    literal
                                     BOOLEAN_LITERAL true
                                     ~
              SEMICOLON ;
              ~
               WHITESPACE 

          class_body_declaration
           block
            LBRACE 
            ~
             WHITESPACE 

            block_statements_opt
            RBRACE }
            ~
             WHITESPACE 

        RBRACE 
        ~

Error detected at line 3 col 5. Amongst the valid repairs are:
  Insert "LBRACE", Delete "// Single line comment."
  Insert "SEMICOLON", Delete "// Single line comment."
  Delete "// Single line comment.", Delete "\n"
  Insert "LBRACE", Keep "// Single line comment.", Insert "LBRACE"
  Insert "LBRACE", Keep "// Single line comment.", Insert "RBRACE"
  Insert "LBRACE", Keep "// Single line comment.", Insert "SEMICOLON"
  Insert "STATIC", Keep "// Single line comment.", Insert "LBRACE"
  Insert "SEMICOLON", Keep "// Single line comment.", Insert "LBRACE"
  Insert "SEMICOLON", Keep "// Single line comment.", Insert "SEMICOLON"
  Insert "SEMICOLON", Keep "// Single line comment.", Delete "\n"
Error detected at line 5 col 1. Amongst the valid repairs are:
  Insert "RBRACE"

ltratt commented 7 years ago

OK, I can reproduce this. I'm not quite sure what's going on yet though...