semgrep / semgrep

Lightweight static analysis for many languages. Find bug variants with patterns that look like source code.
https://semgrep.dev
GNU Lesser General Public License v2.1

Dockerfile syntax parsing error | dockerfile language | Engine(PartialParsing) #10068

Open ghost opened 5 months ago

ghost commented 5 months ago

Describe the bug: When I run a new Dockerfile rule on a YAML file of mine for building a Docker image, I get: Syntax error at line target.dockerfile:24: --> Engine(PartialParsing)

To Reproduce: see the playground snippet at https://semgrep.dev/playground/s/kxK2P

What is the priority of the bug to you?

Environment: all environments

jkinsfather commented 5 months ago

The file fails to parse because the shell command on line 24 contains nine closing square brackets that are escaped while their corresponding opening square brackets are not.

If this inconsistency is fixed, the file parses as expected:

RUN echo "PS1=\"[\e[91m]\u[\e[0m][\e[2m]@[\e[0m][\e[94m]\h[\e[0m]:[\e[32m]\w[\e[0m] [\e[2m]=>[\e[0m] \""
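The mismatch described above can be spotted mechanically. The following is a minimal sketch in Python, not part of semgrep: the helper names and the two sample lines are made up for illustration (they are not the reporter's actual line 24). It assumes that counting `\[` and `\]` is a good enough signal.

```python
import re

def escaped_bracket_counts(line):
    """Return (opening, closing): how many times `\\[` and `\\]` occur in line.

    Simplification: a doubled backslash before a bracket (`\\\\[`) is also
    counted, which is good enough for a quick sanity check.
    """
    opening = len(re.findall(r"\\\[", line))
    closing = len(re.findall(r"\\\]", line))
    return opening, closing

def has_unbalanced_escapes(line):
    """True when escaped opening and closing brackets differ in number."""
    opening, closing = escaped_bracket_counts(line)
    return opening != closing

# Hypothetical sample lines: closing brackets escaped, opening ones not.
unbalanced = r'RUN echo "PS1=\"[\e[91m\]\u[\e[0m\] \""'
# Same line with no bracket escaped at all.
balanced = r'RUN echo "PS1=\"[\e[91m]\u[\e[0m] \""'

print(escaped_bracket_counts(unbalanced), has_unbalanced_escapes(unbalanced))
print(escaped_bracket_counts(balanced), has_unbalanced_escapes(balanced))
```

A check like this only flags suspicious lines. It does not say which form Bash or semgrep expects.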

mjambon commented 5 months ago

@jkinsfather's version above fixes the Bash prompt (which originally showed spurious closing brackets) and also serves as a workaround for semgrep's parsing bug.

I'm reopening the issue because there's still a bug in Semgrep's Dockerfile parser. The following Dockerfile is accepted by docker but not by semgrep:

FROM ubuntu

# sh command to set text color to red (does nothing useful but is valid)
RUN echo "\e[91m"

Edit: the command echo "\e[91m" sets the text color to red in sh (the default shell Docker uses to process RUN instructions) but not in bash, although it is syntactically valid in both.

docker is happy:

$ docker build -t test .
[+] Building 0.0s (6/6) FINISHED                                                
 => [internal] load build definition from Dockerfile                       0.0s
 => => transferring dockerfile: 112B                                       0.0s
 => [internal] load .dockerignore                                          0.0s
 => => transferring context: 2B                                            0.0s
 => [internal] load metadata for docker.io/library/ubuntu:latest           0.0s
 => [1/2] FROM docker.io/library/ubuntu                                    0.0s
 => CACHED [2/2] RUN echo "\e[91m"                                         0.0s
 => exporting to image                                                     0.0s
 => => exporting layers                                                    0.0s
 => => writing image sha256:c0f4394029c2ee3b806186b37a86ccf79c5bf6057f85d  0.0s
 => => naming to docker.io/library/test                                    0.0s

But semgrep fails to parse the RUN line:

$ semgrep -l docker -e 'RUN ...' Dockerfile --verbose
No .semgrepignore found. Using default .semgrepignore rules. See the docs for the list of default ignores: https://semgrep.dev/docs/cli-usage/#ignoring-files
Rules:
- -
[WARN] Syntax error at line Dockerfile:4:
 `RUN echo "\e[91m"` was unexpected

...
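For anyone who wants to re-run the reproduction above, here is a small sketch in Python. It writes the four-line Dockerfile from this comment into a temporary directory and invokes the same semgrep command line shown above. The helper names `write_repro` and `run_repro` are hypothetical. It assumes `semgrep` is on the PATH and returns None if it is not.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Minimal Dockerfile copied from the comment above; the RUN line is line 4.
DOCKERFILE = (
    "FROM ubuntu\n"
    "\n"
    "# sh command to set text color to red (does nothing useful but is valid)\n"
    'RUN echo "\\e[91m"\n'
)

# Same arguments as the semgrep invocation shown in the comment above.
SEMGREP_CMD = ["semgrep", "-l", "docker", "-e", "RUN ...", "Dockerfile", "--verbose"]

def write_repro(directory):
    """Write the minimal Dockerfile into directory and return its path."""
    path = Path(directory) / "Dockerfile"
    path.write_text(DOCKERFILE)
    return path

def run_repro():
    """Run semgrep on the minimal Dockerfile; return None if semgrep is missing."""
    if shutil.which("semgrep") is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        write_repro(tmp)
        return subprocess.run(SEMGREP_CMD, cwd=tmp, capture_output=True, text=True)
```

Calling `run_repro()` and inspecting the `stdout` and `stderr` of the result should show whether a given semgrep version still reports the syntax error on the RUN line.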