While writing the comment above, I noticed that the Dockerfile parser properly detects unterminated here-documents, but doesn't give a hint when the here-document is invalid because its end-marker is indented. For example:
FROM alpine
RUN <<'EOT'
  env
  EOT
Building the above produces an error indicating that the here-document is not terminated:
[+] Building 0.1s (1/1) FINISHED docker:desktop-linux
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 77B 0.0s
Dockerfile:2
--------------------
1 | FROM alpine
2 | >>> RUN <<'EOT'
3 | >>>   env
4 | >>>   EOT
5 |
--------------------
ERROR: failed to solve: unterminated heredoc
Not all users may be aware that the end-marker must start at position 0 (no leading whitespace), so this is likely to be a common mistake. In the example above the indentation is quite visible, but it can be much harder to spot if (e.g.) it's only a single space, or when the Dockerfile is printed as part of CI logs (which tend to indent output).
Initially, I thought the parser was "smart" enough to detect where the here-document SHOULD be terminated, but from a quick test, it looks like it just marks "everything after" as part of the here-document. For example, adding some more instructions after it includes all of those in the error message:
FROM alpine
RUN <<'EOT'
  env
  EOT
RUN echo hello
RUN echo world
RUN echo foobar
[+] Building 0.1s (1/1) FINISHED docker:desktop-linux
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 138B 0.0s
Dockerfile:2
--------------------
1 | FROM alpine
2 | >>> RUN <<'EOT'
3 | >>>   env
4 | >>>   EOT
5 | >>> RUN echo hello
6 | >>> RUN echo hello
7 | >>> RUN echo world
8 | >>> RUN echo foobar
9 |
--------------------
ERROR: failed to solve: unterminated heredoc
☝️ While the above is technically correct, I wonder if we could handle this case in a smarter way.
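To make the current behaviour concrete, here is a minimal Python sketch of a scanner that only accepts the end-marker at position 0. This is not BuildKit's actual parser; the function name `read_heredoc_body` and the two-space indentation in the sample are assumptions for illustration. When no line matches the marker exactly, every remaining line ends up in the here-document body, which is why the error excerpt extends to the end of the file.

```python
def read_heredoc_body(lines, start, marker):
    """Collect heredoc body lines after index `start` (the line that opens it).

    Returns (body, end_index). end_index is None when no line equals
    `marker` exactly, in which case every remaining line is in the body.
    """
    body = []
    for i in range(start + 1, len(lines)):
        if lines[i] == marker:  # strict match: leading whitespace is not allowed
            return body, i
        body.append(lines[i])
    return body, None


dockerfile = [
    "FROM alpine",
    "RUN <<'EOT'",
    "  env",
    "  EOT",  # indented, so it is not recognised as the end-marker
    "RUN echo hello",
    "RUN echo world",
]

body, end = read_heredoc_body(dockerfile, 1, "EOT")
print(end)        # None
print(len(body))  # 4
```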
Suggested improvements
If possible, it would be great if we could detect a potentially indented end-marker. I'm using "potentially" here, because the here-document's content could itself contain a here-doc, so an indented marker is not necessarily a mistake, e.g.
FROM alpine
RUN <<'EOT'
  cat > hello.txt <<EOT
  hello world
  EOT
  EOT
RUN echo hello
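A rough sketch of how such candidates could be collected, in Python rather than BuildKit's own code: the helper `indented_marker_candidates` is a hypothetical name, and the two-space indentation in the sample is assumed. It returns every line after the start of the here-document that equals the marker only once leading whitespace is removed. For the nested example this yields two candidates, which illustrates why any hint can only say "potentially".

```python
def indented_marker_candidates(lines, start, marker):
    """Return 1-based numbers of lines after `start` that match `marker`
    only after stripping leading spaces and tabs."""
    candidates = []
    for lineno, line in enumerate(lines, start=1):
        if lineno <= start:
            continue  # skip everything up to and including the opening line
        if line != marker and line.lstrip(" \t") == marker:
            candidates.append(lineno)
    return candidates


nested = [
    "FROM alpine",
    "RUN <<'EOT'",
    "  cat > hello.txt <<EOT",
    "  hello world",
    "  EOT",
    "  EOT",
    "RUN echo hello",
]

print(indented_marker_candidates(nested, 2, "EOT"))  # [5, 6]
```

Which candidate to report is a design choice; picking the last one is one possible heuristic, since an inner here-doc would normally end before the outer one.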
When parsing fails because no end-marker was found, the above information could be used when printing the error, to point at a more targeted solution:
Dockerfile:2
--------------------
1 | FROM alpine
2 | >>> RUN <<'EOT'
3 | >>>   cat > hello.txt <<EOT
4 | >>>   hello world
5 | >>>   EOT
6 | >>>   EOT
7 | RUN echo hello
8 |
--------------------
ERROR: failed to solve: unterminated heredoc: end-marker at line 6 is indented.
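As an illustration of the proposed output, the following Python sketch renders an excerpt in which only the lines from the start of the here-document up to the indented end-marker candidate are flagged with `>>>`. The function `render_error` is hypothetical, and the format is simplified (no separator lines or column padding) rather than a reproduction of BuildKit's real error printer.

```python
def render_error(lines, start, end):
    """Format a simplified error excerpt.

    `start` and `end` are 1-based line numbers: the line that opens the
    heredoc and the indented end-marker candidate. Only that range is
    flagged with '>>>'; later lines are shown unflagged.
    """
    out = []
    for n, line in enumerate(lines, start=1):
        flag = ">>> " if start <= n <= end else ""
        out.append(f"{n} | {flag}{line}")
    out.append(
        "ERROR: failed to solve: unterminated heredoc: "
        f"end-marker at line {end} is indented."
    )
    return "\n".join(out)


source = [
    "FROM alpine",
    "RUN <<'EOT'",
    "  cat > hello.txt <<EOT",
    "  hello world",
    "  EOT",
    "  EOT",
    "RUN echo hello",
]

report = render_error(source, start=2, end=6)
print(report)
```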
Perhaps we could even omit the intermediate lines (here-docs can be long!) and only point out the start and the (expected) end:
Dockerfile:2
--------------------
1 | FROM alpine
2 | >>> RUN <<'EOT'
...
97 | >>>   EOT
98 | RUN echo hello
99 |
--------------------
ERROR: failed to solve: unterminated heredoc: heredoc starts at line 2 but the end-marker (EOT) at line 97 is indented.
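Eliding the middle of a long here-document could look roughly like the Python sketch below. The helper `elide_middle` and the generated 96-line here-document are made up for illustration; the idea is only to keep the first and last flagged rows and collapse everything in between into a single `...` row.

```python
def elide_middle(rows, head=1, tail=1):
    """Keep the first `head` and last `tail` rows and replace the rest with '...'."""
    if len(rows) <= head + tail:
        return list(rows)  # nothing worth eliding
    return rows[:head] + ["..."] + rows[len(rows) - tail:]


# A heredoc spanning lines 2 through 97 of a hypothetical Dockerfile.
heredoc = ["RUN <<'EOT'"] + ["  echo step"] * 94 + ["  EOT"]
rows = [f"{n} | >>> {line}" for n, line in enumerate(heredoc, start=2)]

short = elide_middle(rows)
print("\n".join(short))
# 2 | >>> RUN <<'EOT'
# ...
# 97 | >>>   EOT
```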