camdencheek / tree-sitter-dockerfile

A tree-sitter grammar for Dockerfile
MIT License
71 stars, 20 forks

Add ARG support for multiple variables in one instruction #54

Open mjambon opened 3 months ago

The official documentation makes no mention of supporting multiple variables in a single ARG instruction (https://docs.docker.com/reference/dockerfile/#arg).

However, the docker command does accept multiple variables in one ARG instruction. For example, the following does what one would expect:

FROM busybox
ARG a=1 b=2
RUN echo "a: $a"
RUN echo "b: $b"

i.e. it prints a: 1 and b: 2.

ARG can be used to declare variables without a default. Their value can then be set with --build-arg. The following declares two variables a and b:

FROM busybox
ARG a b

Where it gets weird is that ARG doesn't support trailing comments: it treats # as an ordinary variable name. The following declares 3 variables a, #, and b:

FROM busybox
ARG a # b

The following sets a to 2, even though a=2 looks like it is commented out. It also declares a variable named #, which --build-arg can then set:

$ cat bad-arg.dockerfile 
FROM busybox
ARG a=1 # a=2
RUN echo "a: $a"
RUN echo "#: $#"

$ docker build -f bad-arg.dockerfile --build-arg '#=oh no' .
[+] Building 1.0s (7/7) FINISHED                                                
 => [internal] load .dockerignore                                          0.0s
 => => transferring context: 2B                                            0.0s
 => [internal] load build definition from bad-arg.dockerfile               0.0s
 => => transferring dockerfile: 106B                                       0.0s
 => [internal] load metadata for docker.io/library/busybox:latest          0.0s
 => CACHED [1/3] FROM docker.io/library/busybox                            0.0s
 => [2/3] RUN echo "a: 2"                                                  0.4s
 => [3/3] RUN echo "#: oh no"                                              0.5s
 => exporting to image                                                     0.0s
 => => exporting layers                                                    0.0s
 => => writing image sha256:742fdc7edc6015ff3a4612019326b712f88571dc82cba  0.0s
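
The behavior observed above is consistent with ARG splitting its operands on whitespace into name[=value] tokens, with # treated as just another name and a later token overriding an earlier one. Here is a minimal Python sketch of that model. It is an illustration of the observed behavior, not Docker's actual parser; the function name parse_arg_instruction is hypothetical, and quoting and escapes are deliberately ignored.

```python
from typing import Dict, Optional


def parse_arg_instruction(line: str) -> Dict[str, Optional[str]]:
    """Model one `ARG ...` line as {name: default or None}.

    Assumption: operands are split on whitespace only, so quoted
    values containing spaces are not handled.
    """
    parts = line.split(None, 1)
    if not parts or parts[0].upper() != "ARG":
        raise ValueError(f"not an ARG instruction: {line!r}")
    rest = parts[1] if len(parts) > 1 else ""

    variables: Dict[str, Optional[str]] = {}
    for token in rest.split():
        name, sep, value = token.partition("=")
        # `#` does not start a comment here; it is kept as a name.
        # A repeated name overwrites the earlier default.
        variables[name] = value if sep else None
    return variables


print(parse_arg_instruction("ARG a=1 # a=2"))
```

Under this model, "ARG a # b" yields three names without defaults, and "ARG a=1 # a=2" yields a with default 2 plus a variable named #, matching the build output shown above.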

tree-sitter-dockerfile doesn't currently support this syntax. We might want to support it even though its use should be discouraged.

(In Semgrep, we'd probably add a rule that forbids the use of # as a variable name and also maybe forbid multiple variables in a single ARG instruction.)
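
To make the two proposed checks concrete, here is a rough Python sketch of what such a lint could flag. This is not Semgrep rule syntax; the function name lint_arg_line and the warning messages are hypothetical. It assumes operands are split on whitespace, and it flags any name that starts with #, which is a slightly broader choice than the bare # discussed above.

```python
from typing import List


def lint_arg_line(line: str) -> List[str]:
    """Return warnings for one Dockerfile line (empty list if none)."""
    parts = line.split()
    if not parts or parts[0].upper() != "ARG":
        return []  # only ARG instructions are checked

    # Each operand is name or name=value; keep only the name part.
    names = [token.partition("=")[0] for token in parts[1:]]

    warnings: List[str] = []
    if len(names) > 1:
        warnings.append("multiple variables in one ARG instruction")
    if any(name.startswith("#") for name in names):
        warnings.append("'#' is parsed as a variable name, not a comment")
    return warnings


for warning in lint_arg_line("ARG a=1 # a=2"):
    print(warning)
```

On the bad-arg.dockerfile example, both warnings fire for the ARG line, while a plain "ARG a=1" passes cleanly.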