dockerfile: carriage return conflicting with heredocs

tonistiigi commented 1 year ago

On Windows clients, the editor can add a carriage return character next to the newline. In certain situations this can cause the build to fail.

Eg.

 # cat Dockerfile| base64
ZnJvbSBhbHBpbmUNCnJ1biBlY2hvIG9sZFwNCiAgc3R5bGUgPiAvYmFyCnJ1biA8PGVvdA0KICBzZXQgLWUNCiAgZWNobyBoZWxsbyA+IGZvbw0KZW90DQo=

from alpine
run echo old\
  style > /bar
run <<eot
  set -e
  echo hello > foo
eot

Will fail with

 => ERROR [3/3] RUN <<eot (set -e...)                                                                                                                                                                                                           0.3s
------
 > [3/3] RUN <<eot (set -e...):
0.252 /bin/sh: set: line 0: illegal option -
------
Dockerfile:4
--------------------
   3 |       style > /bar
   4 | >>> run <<eot
   5 | >>>   set -e
   6 | >>>   echo hello > foo
   7 | >>> eot
   8 |
--------------------

failed to solve: process "/bin/sh -c set -e\r\n echo hello > foo\r\n" you can see the \r after set -e that is causing the issue.

To fix it for RUN statements seems trivial as we can just replace the values before passing. For COPY it gets more tricky as it isn't completely out of the question that the user wants to copy CR character. We could just not allow that. Maybe there are better solutions?

I'm not sure if there are any other possible cases outside of Heredocs. Above I provided an example of an old style \ line continuation, and it doesn't seem to be affected as it just concatenates everything into a single line.

This also could cause cache misses between Windows and Unix clients even if we patch the error cases. Ideally CR should not cause any behavior differences.

@jedevc @AkihiroSuda

jedevc commented 1 year ago

Ah, looks like we hit this before as well: https://github.com/moby/buildkit/issues/3329.

To fix it for RUN statements seems trivial as we can just replace the values before passing.

I actually don't think this is trivial, e.g. the example in the linked issue:

RUN <<EOT
  echo "foo
  bar"
EOT

It's probably not a good idea to just trim the carriage return in the middle of that string, for the same reason we might not want to do it in COPY.

Here's a few solution ideas I had, but I don't really like any of them:

Just remove all carriage returns all the time. Potentially a breaking change, also this might play havoc with some windows executor stuff maybe?
Remove the carriage returns depending on the target platform - linux targets always removed, windows targets always kept. Inconsistent behavior is weird though.
Remove the carriage returns depending on the client platform. Even worse, this is both inconsistent and results in cache misses :)

This also could cause cache misses between Windows and Unix clients even if we patch the error cases. Ideally CR should not cause any behavior differences.

Hopefully this shouldn't be an issue today. e.g. if I clone a git repo, the line-endings are the same, right? I don't think they're automatically converted or anything like that. As long as the two platform clients are building from the same source of truth, I don't think it should be a problem.

tonistiigi commented 1 year ago

It's probably not a good idea to just trim the carriage return in the middle of that string, for the same reason we might not want to do it in COPY.

What issue would it cause? I tend to think we should even do it for COPY. At least when not targeting windows (@gabriel-samfira ). We already merge lines that end with \ so there is a precedent for not keeping the original bytes.

e.g. if I clone a git repo, the line-endings are the same, right?

I believe they are not. I guess it depends on how the Git is configured but I think git converts them for you.

From google image search:

jedevc commented 1 year ago

I tend to think we should even do it for COPY

I'm happy with this, but then, I don't really use windows, so don't really end up with CRLF anyways.

At least when not targeting windows

This shouldn't be conditional. E.g. if we had a single dockerfile used for building multiple linux and windows targets, whether the CRLF is present shouldn't differ between platform.

TBBle commented 1 year ago

Worth noting that on Windows, CMD is one of the few (only?) remaining tools in the Windows ecosystem that can't handle LF-only scripts. And it's still the default SHELL for Dockerfile. This is the inverse problem of what we're seeing here, that the default SHELL on Linux (and most POSIX-based shells I've tried over the years) can't handle CRLFs in their shell scripts. (I actually don't know if a build of bash on Windows can handle CRLF-terminated shell scripts... I avoid scripting against bash on Windows where possible for platform-mismatch reasons like this.)

So since these ecosystem pieces aren't going to change, I suspect that for RUN heredocs specifically, we will need to do "magic newlines" based on target platform (Windows/Non-Windows) to make this least-surprising for users particularly with mixed-platform Dockerfiles. (The alternative for this case, mixed-line-ending Dockerfiles, is technically feasible, but some editors and RCS's will silently fix that for you, so it's an ecosystem nightmare.)

By "magic newlines", I mean first (and early) trivially canonicalising the heredoc to LF, using that form for everything, and then at or near final output time, if targeting Windows, trivially CRLF-ise it. That way, when looking at "did the RUN line change from what's in the cache", a file newline switch is not seen as a change, e.g. when someone checks out the Dockerfile with git configured with ~~the painful default~~ core.autocrlf=true, or runs dos2unix on it for fun.

I say "trivial" here because if the user provides CRCRLF-terminated files due to either pathological behaviour, misconfiguration of Perforce, or really misusing git, then that's on them. So we just change any CRLF we see into LF, and invert by prepending CR to any LF we see. (That also puts us in the "silently fixing mixed-line-endings" club...)

Anyway, on the above basis (that the core need for this magic is due to the shell) I don't think the above rationale applies to COPY heredocs. There's probably an argument to be made for COPY heredocs being used to write script files, but I am much less certain of how often that's something that is actually done.

Auto-LFing COPY heredocs is still a little bit fraught, but apart from CMD scripts I'm unaware of any line-based fileformat in common use that requires CRLFs and rejects LF-only, so only a little bit.

An additional option would be a parameter for the RUN or COPY letting you specify (or override) the eol behaviour. That feels like more UX-cost than value to me, but that's just my opinion.

tonistiigi commented 1 year ago

This shouldn't be conditional. E.g. if we had a single dockerfile used for building multiple linux and windows targets, whether the CRLF is present shouldn't differ between platform.

I can't think of a case where the same command (chain) would be functional for both linux and wcow targets and require the same cache layer. Internally that is even impossible because wcow tar layers have a different format so you can mix them with linux layers.

Theoretically, we could normalize the newlines for the cache checksum even if they differ in the actual execution (I think that's what @TBBle suggested as well), but I'm not sure if that is even worth it or maybe better to avoid that for cache safety.

TBBle commented 1 year ago

Roughly, yeah. I was leaning towards normalised newlines for everything except the "write the file out to disk for the shell to execute" step, e.g., history annotations, debugging output, etc. We normalise every other newline in the file by dint of removing them when splitting the Dockerfile into commands, and then potentially outputting them line-by-line during the build. (I hope... if a CRLF-format Dockerfile leads to random extra CRs in the output logs, that's a different, horrible, issue)

I was thinking of cache hits more in terms of the same Dockerfile for the same platform, but run from different RCS checkouts, or before-and-after an editor normalises or renormalises newlines in the file.

Basically, AFAICT, if we don't normalise and then (as late as possible) platformise newlines in the heredocs, the only way to have different RUN heredocs for different platforms in the same Dockerfile will be to have mixed line-endings, which is IMHO a worst-of-both-worlds result for everyone. (Based on this bug, that's actually the current state.)

moby / buildkit

dockerfile: carriage return conflicting with heredocs #4282