camdencheek / tree-sitter-dockerfile

A tree-sitter grammar for Dockerfile
MIT License

Add support for composite strings such as a$B"c d" #38

Open mjambon opened 1 year ago

Here's a Dockerfile with a multi-fragment string ("a"b'c'):

$ cat Dockerfile 
FROM busybox
ENV A "a"b'c'
RUN echo "$A"

Docker interprets "a"b'c' as abc, as shown by the docker build log:

$ docker build .
[+] Building 0.6s (6/6) FINISHED                                                
 => [internal] load .dockerignore                                          0.0s
 => => transferring context: 2B                                            0.0s
 => [internal] load build definition from Dockerfile                       0.0s
 => => transferring dockerfile: 78B                                        0.0s
 => [internal] load metadata for docker.io/library/busybox:latest          0.0s
 => CACHED [1/2] FROM docker.io/library/busybox                            0.0s
 => [2/2] RUN echo "abc"                                                   0.5s
 => exporting to image                                                     0.0s
 => => exporting layers                                                    0.0s
 => => writing image sha256:ff61e5984a58fdc43942411aa0e8a51d68417d11f91dd  0.0s
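
For reference, here's a rough Python sketch of the concatenation behavior the build log shows: adjacent quoted and unquoted fragments are joined into one value. It is only an illustration, not Docker's actual implementation. The function names and fragment labels are made up for this example, and it deliberately ignores escape sequences and $VAR expansion.

```python
def split_fragments(word):
    """Split a composite string into (kind, text) fragments.

    Simplifying assumptions (not Docker's real rules): no backslash
    escapes, no variable expansion, and a quote always closes with the
    same quote character.
    """
    fragments = []
    i = 0
    while i < len(word):
        ch = word[i]
        if ch in ('"', "'"):
            # Quoted fragment: runs up to the matching closing quote.
            end = word.find(ch, i + 1)
            if end == -1:
                raise ValueError("unterminated quote")
            kind = "double_quoted" if ch == '"' else "single_quoted"
            fragments.append((kind, word[i + 1:end]))
            i = end + 1
        else:
            # Unquoted fragment: runs up to the next quote character.
            j = i
            while j < len(word) and word[j] not in "\"'":
                j += 1
            fragments.append(("unquoted", word[i:j]))
            i = j
    return fragments


def join_fragments(word):
    """Concatenate the text of all fragments into a single value."""
    return "".join(text for _, text in split_fragments(word))


print(split_fragments("\"a\"b'c'"))
print(join_fragments("\"a\"b'c'"))
```

A grammar that supports composite strings would presumably need to accept a sequence of such fragments as one value rather than exactly one string node.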

The tree-sitter-dockerfile parser fails on "a"b'c': it wraps the first two fragments in an ERROR node and keeps only 'c' as the value:

$ tree-sitter parse Dockerfile 
(source_file [0, 0] - [3, 0]
  (from_instruction [0, 0] - [0, 12]
    (image_spec [0, 5] - [0, 12]
      name: (image_name [0, 5] - [0, 12])))
  (env_instruction [1, 0] - [1, 13]
    (env_pair [1, 4] - [1, 13]
      name: (unquoted_string [1, 4] - [1, 5])
      (ERROR [1, 6] - [1, 10]
        (double_quoted_string [1, 6] - [1, 9])
        (unquoted_string [1, 9] - [1, 10]))
      value: (single_quoted_string [1, 10] - [1, 13])))
  (run_instruction [2, 0] - [2, 13]
    (shell_command [2, 4] - [2, 13]
      (shell_fragment [2, 4] - [2, 13]))))
Dockerfile  0 ms    (ERROR [1, 6] - [1, 10])