tree-sitter-grammars / tree-sitter-hcl

HCL grammar for tree-sitter
https://tree-sitter-grammars.github.io/tree-sitter-hcl/
Apache License 2.0

Add a shim to move a comment extra into a body node #31

Closed ahlinc closed 1 year ago

ahlinc commented 1 year ago

A demo of the concept: how to trick the parser into moving a comment extra node inside a body node.

Screenshot ![Screenshot from 2023-04-08 07-02-02](https://user-images.githubusercontent.com/14666676/230702306-c1ca63f2-33dc-4cdd-a2bd-4c7de754da3a.png)

Closes #30

MichaHoffmann commented 1 year ago

Hm, I messed up CI at some point; I'll fix it later. I added some tests locally to explore the PR and most of them pass, but this one fails:

================================================================================
comment in empty block body
================================================================================

block {
  //foo
}

with

$ tree-sitter test -f "comment in empty" -d
  attributes:
  collections:
  conditionals:
  for_expressions:
  function_calls:
  literals:
  operators:
  splat:
  real_world:
  strings:
  templates:
  blocks:
  comments:
new_parse
process version:0, version_count:1, state:1, row:0, col:0
lex_external state:2, row:0, column:0
  skip character:10
lexed_lookahead sym:_shim, size:1
shift state:444
process version:0, version_count:1, state:444, row:1, col:0
lex_internal state:41, row:1, column:0
  consume character:'b'
  consume character:'l'
  consume character:'o'
  consume character:'c'
  consume character:'k'
lexed_lookahead sym:identifier, size:5
shift state:406
process version:0, version_count:1, state:406, row:1, col:5
lex_external state:3, row:1, column:5
  skip character:' '
lex_internal state:4, row:1, column:5
  consume character:' '
lexed_lookahead sym:_whitespace, size:1
shift_extra
process version:0, version_count:1, state:406, row:1, col:6
lex_external state:3, row:1, column:6
lex_internal state:4, row:1, column:6
  consume character:'{'
lexed_lookahead sym:{, size:1
shift state:543
process version:0, version_count:1, state:543, row:1, col:7
lex_external state:2, row:1, column:7
  skip character:10
  skip character:' '
  skip character:' '
lexed_lookahead sym:_shim, size:3
reduce sym:block_start, child_count:1
shift state:444
process version:0, version_count:1, state:444, row:2, col:2
lex_internal state:41, row:2, column:2
  consume character:'/'
  consume character:'/'
  consume character:'f'
  consume character:'o'
  consume character:'o'
lexed_lookahead sym:comment, size:5
shift_extra
process version:0, version_count:1, state:444, row:2, col:7
lex_internal state:41, row:2, column:7
  consume character:10
lexed_lookahead sym:_whitespace, size:1
shift_extra
process version:0, version_count:1, state:444, row:3, col:0
lex_internal state:41, row:3, column:0
  consume character:'}'
lexed_lookahead sym:}, size:1
detect_error
resume version:0
recover_to_previous state:404, depth:2
skip_token symbol:}
process version:0, version_count:2, state:0, row:3, col:1
lex_external state:1, row:3, column:1
  skip character:10
lex_internal state:0, row:3, column:1
  consume character:10
lexed_lookahead sym:_whitespace, size:1
shift_extra
process version:1, version_count:2, state:404, row:3, col:0
lex_external state:2, row:3, column:0
lex_internal state:41, row:3, column:0
  consume character:'}'
lexed_lookahead sym:}, size:1
shift state:501
process version:1, version_count:2, state:501, row:3, col:1
lex_internal state:41, row:3, column:1
  consume character:10
lexed_lookahead sym:_whitespace, size:1
shift_extra
condense
process version:0, version_count:1, state:501, row:4, col:0
lex_internal state:41, row:4, column:0
lexed_lookahead sym:end, size:0
reduce sym:block_end, child_count:1
reduce sym:block, child_count:3
reduce sym:body, child_count:2
reduce sym:config_file, child_count:1
accept
done
    ✗ comment in empty block body

1 failure:

expected / actual

  1. comment in empty block body:

    (config_file
      (body
        (block
          (identifier)
          (block_start)
          (ERROR)
          (comment)
          (block_end))))

I wonder if comments confuse the external scanner, which expects to see a SHIM but sees a comment instead. Do you have an idea?

ahlinc commented 1 year ago

Fixed!

Screenshot ![Screenshot from 2023-04-08 15-36-16](https://user-images.githubusercontent.com/14666676/230721388-ff6fe16d-7eae-443c-bd28-1e1e02650432.png)

MichaHoffmann commented 1 year ago

Awesome! Thank you very much; I'll add tests and some handling for multiline and "//" comments in a separate PR!

MichaHoffmann commented 1 year ago

Ah, I think I merged a bit too soon, without waiting for the pipeline.

#foo

without a trailing newline now produces an endless loop; I'll fix it in a follow-up PR.

ahlinc commented 1 year ago

> Ah, I think I merged a bit too soon, without waiting for the pipeline.
>
> #foo
>
> without a trailing newline now produces an endless loop; I'll fix it in a follow-up PR.

@MichaHoffmann you need to add a check for EOF using `lexer->eof(lexer)`.
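A minimal sketch of such a check, not the code from this PR. The merged loop presumably waited for a newline that never arrives when `#foo` is the last thing in the file. `Lexer`, `lexer_from_string` and `skip_line_comment` are hypothetical names; the struct only imitates the `lookahead`, `advance` and `eof` members of tree-sitter's `TSLexer`, so the snippet compiles without the tree-sitter headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the part of TSLexer used here. A real external
 * scanner would include <tree_sitter/parser.h> and use TSLexer instead. */
typedef struct Lexer Lexer;
struct Lexer {
  int32_t lookahead;                   /* current character, 0 at EOF */
  void (*advance)(Lexer *, bool skip);
  bool (*eof)(const Lexer *);
  const char *src;                     /* string backing, for the sketch only */
  size_t pos;
};

static void str_advance(Lexer *lx, bool skip) {
  (void)skip;
  if (lx->src[lx->pos] != '\0') lx->pos++;  /* advancing at EOF is a no-op */
  lx->lookahead = (unsigned char)lx->src[lx->pos];
}

static bool str_eof(const Lexer *lx) { return lx->src[lx->pos] == '\0'; }

static Lexer lexer_from_string(const char *src) {
  assert(src != NULL);
  Lexer lx = { (unsigned char)src[0], str_advance, str_eof, src, 0 };
  return lx;
}

/* Skip a '#' line comment. Stops in front of the newline or at EOF,
 * whichever comes first, so a missing trailing newline cannot loop forever. */
static bool skip_line_comment(Lexer *lexer) {
  if (lexer->lookahead != '#') return false;
  while (lexer->lookahead != '\n' && !lexer->eof(lexer)) {
    lexer->advance(lexer, true);
  }
  return true;
}
```

With the EOF test in the loop condition, `#foo` without a trailing newline ends the loop as soon as the input is exhausted.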

And extend the `skip_comment` function to cover all the other comment variants.
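One way the extended function could look, as a sketch only. It handles the three HCL comment forms (`#`, `//` and `/* ... */`) and checks for EOF in every loop. The `Lexer` struct and `lexer_from_string` are hypothetical stand-ins for `TSLexer`, and this `skip_comment` is an illustration, not the function from the repository. Returning `false` for an unterminated block comment is an assumption made here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the part of TSLexer used here. */
typedef struct Lexer Lexer;
struct Lexer {
  int32_t lookahead;                   /* current character, 0 at EOF */
  void (*advance)(Lexer *, bool skip);
  bool (*eof)(const Lexer *);
  const char *src;                     /* string backing, for the sketch only */
  size_t pos;
};

static void str_advance(Lexer *lx, bool skip) {
  (void)skip;
  if (lx->src[lx->pos] != '\0') lx->pos++;
  lx->lookahead = (unsigned char)lx->src[lx->pos];
}

static bool str_eof(const Lexer *lx) { return lx->src[lx->pos] == '\0'; }

static Lexer lexer_from_string(const char *src) {
  assert(src != NULL);
  Lexer lx = { (unsigned char)src[0], str_advance, str_eof, src, 0 };
  return lx;
}

/* Consume everything up to, but not including, the next newline or EOF. */
static void skip_to_line_end(Lexer *lexer) {
  while (lexer->lookahead != '\n' && !lexer->eof(lexer)) {
    lexer->advance(lexer, true);
  }
}

/* Returns true if a complete comment was skipped. */
static bool skip_comment(Lexer *lexer) {
  if (lexer->lookahead == '#') { skip_to_line_end(lexer); return true; }
  if (lexer->lookahead != '/') return false;
  lexer->advance(lexer, true);
  if (lexer->lookahead == '/') { skip_to_line_end(lexer); return true; }
  if (lexer->lookahead != '*') return false;  /* lone '/', e.g. division */
  lexer->advance(lexer, true);
  while (!lexer->eof(lexer)) {
    if (lexer->lookahead == '*') {
      lexer->advance(lexer, true);
      if (lexer->lookahead == '/') {
        lexer->advance(lexer, true);          /* consume the closing '/' */
        return true;
      }
    } else {
      lexer->advance(lexer, true);
    }
  }
  return false;                               /* unterminated block comment */
}
```

The block-comment loop re-examines a `*` that is not followed by `/`, so a comment ending in `**/` is still closed correctly.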

It also makes sense to add an optimization that avoids scanning long comments twice: just detect the comment start in the external scanner and fall back to the built-in lexer to consume it.
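A rough sketch of that detection step, under the same assumptions as before: `Lexer` and `lexer_from_string` are hypothetical stand-ins for `TSLexer`, and `at_comment_start` is an invented name. What the scanner does once a comment start is found (emit the shim, or return `false` so the built-in lexer takes over) is not spelled out in this thread and is left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the part of TSLexer used here. */
typedef struct Lexer Lexer;
struct Lexer {
  int32_t lookahead;                   /* current character, 0 at EOF */
  void (*advance)(Lexer *, bool skip);
  bool (*eof)(const Lexer *);
  const char *src;                     /* string backing, for the sketch only */
  size_t pos;
};

static void str_advance(Lexer *lx, bool skip) {
  (void)skip;
  if (lx->src[lx->pos] != '\0') lx->pos++;
  lx->lookahead = (unsigned char)lx->src[lx->pos];
}

static bool str_eof(const Lexer *lx) { return lx->src[lx->pos] == '\0'; }

static Lexer lexer_from_string(const char *src) {
  assert(src != NULL);
  Lexer lx = { (unsigned char)src[0], str_advance, str_eof, src, 0 };
  return lx;
}

/* Report whether a comment begins at the current position.
 * Consumes at most one character, no matter how long the comment is,
 * so the comment body is only ever scanned once (by the built-in lexer). */
static bool at_comment_start(Lexer *lexer) {
  if (lexer->lookahead == '#') return true;
  if (lexer->lookahead != '/') return false;
  lexer->advance(lexer, false);        /* look at the character after '/' */
  return lexer->lookahead == '/' || lexer->lookahead == '*';
}
```

The check may step past a leading `/`. In a real scanner this should be harmless when the scan function then returns `false`, since tree-sitter is expected to rewind the lexer before the built-in lexer runs, but that behaviour is worth confirming against the tree-sitter version in use.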