camdencheek / tree-sitter-dockerfile

A tree-sitter grammar for Dockerfile
MIT License

Some tokens incorrectly contain whitespace #19

Closed mjambon closed 2 years ago

mjambon commented 2 years ago

The string "a b" is not a valid image name, but the parser accepts it:

$ tree-sitter generate && tree-sitter parse <(echo 'from a b')
(source_file [0, 0] - [1, 0]
  (from_instruction [0, 0] - [0, 8]
    (image_spec [0, 5] - [0, 8]
      name: (image_name [0, 5] - [0, 8]))))

This is because the rule is specified as

    image_name: ($) =>
      seq(
        choice(/[^@:\s\$-]/, $.expansion),
        repeat(choice(/[^@:\s\$]+/, $.expansion))
      ),

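which allows whitespace between the fragments that form the image name: each regex excludes \s, but tree-sitter accepts extras (such as whitespace) between any two tokens in a seq or repeat.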

Fixing this would also resolve the problem we're running into in #18: with a grammar whose rules are reordered, the AS in "FROM a AS b" is parsed as a fragment of the image_name "a AS b" rather than as the AS keyword.
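
To make the mechanism concrete, here is a rough standalone model of it in plain JavaScript. This is only a sketch, not tree-sitter's actual lexer: it assumes whitespace is the only extra, ignores $.expansion entirely, and the function name matchImageName and its immediate option are made up for illustration. The immediate flag imitates what wrapping the repeated fragment in tree-sitter's token.immediate() would do, namely refuse to skip extras before that token. Whether that is the right fix for this grammar is a separate question.

```javascript
// Simplified model only; NOT how tree-sitter is implemented.
// Assumptions: whitespace is the only "extra", expansions are ignored.

const FIRST = /^[^@:\s\$-]/; // first fragment of image_name (one character)
const REST = /^[^@:\s\$]+/;  // each repeated fragment

// Returns the prefix of `input` that would be consumed as an image name.
// `immediate: true` models token.immediate() on the repeated fragment.
function matchImageName(input, { immediate }) {
  const first = FIRST.exec(input);
  if (!first) return null;
  let pos = first[0].length;

  for (;;) {
    let next = pos;
    if (!immediate) {
      // Default behavior: extras (whitespace) are skipped before a token.
      while (next < input.length && /\s/.test(input[next])) next++;
    }
    const frag = REST.exec(input.slice(next));
    if (!frag) break;
    pos = next + frag[0].length;
  }
  return input.slice(0, pos);
}

console.log(matchImageName("a b", { immediate: false }));    // a b
console.log(matchImageName("a AS b", { immediate: false })); // a AS b
console.log(matchImageName("a AS b", { immediate: true }));  // a
console.log(matchImageName("ubuntu", { immediate: true }));  // ubuntu
```

In the default mode the whole of "a AS b" is swallowed as one name, which matches the symptom in #18. In the immediate mode the name stops at the first space, leaving AS to be matched as a keyword.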

I'm looking into a possible solution.

camdencheek commented 2 years ago

Fixed by #20