moby / buildkit

concurrent, cache-efficient, and Dockerfile-agnostic builder toolkit
https://github.com/moby/moby/issues/34227
Apache License 2.0
8.16k stars 1.16k forks source link

[Feature Request] --progress=verbose #1556

Open RuRo opened 4 years ago

RuRo commented 4 years ago

A similar issue was already raised in the past (#526), but it got closed after the --progress option was implemented. In my opinion, the current --progress option conflates "tty support" and "output verbosity". As a result, there is currently no way to get a "verbose" human-readable output format.


For example, running docker build --no-cache --progress=tty with

this Dockerfile

FROM alpine:latest as base

FROM base as a
RUN for i in $(seq 1 10); do sleep 1; echo "a$i"; done

FROM base as b
RUN for i in $(seq 1 10); do sleep 1; echo "b$i"; done

FROM base
COPY --from=a . ./
COPY --from=b . ./

shows this log during the actual building process

[+] Building 10.1s (4/8)                                                                                                                                                      
 => [internal] load build definition from Dockerfile                                                                                                                     0.9s
 => => transferring dockerfile: 263B                                                                                                                                     0.0s
 => [internal] load .dockerignore                                                                                                                                        1.3s
 => => transferring context: 2B                                                                                                                                          0.0s
 => [internal] load metadata for docker.io/library/alpine:latest                                                                                                         0.0s
 => CACHED [base 1/1] FROM docker.io/library/alpine:latest                                                                                                               0.0s
 => [b 1/1] RUN for i in $(seq 1 10); do sleep 1; echo "b$i"; done                                                                                                       8.6s
 => => # b2                                                                                                                                                                  
 => => # b3                                                                                                                                                                  
 => => # b4                                                                                                                                                                  
 => => # b5                                                                                                                                                                  
 => => # b6                                                                                                                                                                  
 => => # b7                                                                                                                                                                  
 => [a 1/1] RUN for i in $(seq 1 10); do sleep 1; echo "a$i"; done                                                                                                       8.6s
 => => # a1                                                                                                                                                                  
 => => # a2                                                                                                                                                                  
 => => # a3                                                                                                                                                                  
 => => # a4                                                                                                                                                                  
 => => # a5                                                                                                                                                                  
 => => # a6                                                                                                                                                                  

but then collapses the output to this shorter log after the build is complete.

[+] Building 19.5s (9/9) FINISHED                                                                                                                                             
 => [internal] load build definition from Dockerfile                                                                                                                     0.9s
 => => transferring dockerfile: 263B                                                                                                                                     0.0s
 => [internal] load .dockerignore                                                                                                                                        1.3s
 => => transferring context: 2B                                                                                                                                          0.0s
 => [internal] load metadata for docker.io/library/alpine:latest                                                                                                         0.0s
 => CACHED [base 1/1] FROM docker.io/library/alpine:latest                                                                                                               0.0s
 => [b 1/1] RUN for i in $(seq 1 10); do sleep 1; echo "b$i"; done                                                                                                      12.5s
 => [a 1/1] RUN for i in $(seq 1 10); do sleep 1; echo "a$i"; done                                                                                                      13.3s
 => [stage-3 1/2] COPY --from=a . ./                                                                                                                                     1.4s 
 => [stage-3 2/2] COPY --from=b . ./                                                                                                                                     1.3s 
 => exporting to image                                                                                                                                                   1.1s 
 => => exporting layers                                                                                                                                                  1.1s 
 => => writing image sha256:afaa2ffa1a1dc5f77d93851187c3ebd623599a7ae5cab7ac9a969c2e5b1a0487                                                                             0.0s 

This behavior is undesirable, since most developers (in my experience) want to be able to look at the full logs during the build and after the build. The currently "recommended" way to get the full logs is to use --progress=plain.

However, running docker build --no-cache --progress=plain with the same Dockerfile results in

this unreadable mess, where the logs from different parallel stages are mixed together.

#2 [internal] load .dockerignore
#2 transferring context: 2B done
#2 DONE 0.8s

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 263B done
#1 DONE 1.1s

#3 [internal] load metadata for docker.io/library/alpine:latest
#3 DONE 0.0s

#4 [base 1/1] FROM docker.io/library/alpine:latest
#4 CACHED

#5 [a 1/1] RUN for i in $(seq 1 10); do sleep 1; echo "a$i"; done
#5 3.135 a1
#5 4.136 a2
#5 5.137 a3
#5 6.137 a4
#5 7.138 a5
#5 8.139 a6
#5 9.139 a7
#5 ...

#7 [b 1/1] RUN for i in $(seq 1 10); do sleep 1; echo "b$i"; done
#7 3.062 b1
#7 4.062 b2
#7 5.063 b3
#7 6.063 b4
#7 7.064 b5
#7 8.065 b6
#7 9.066 b7
#7 10.07 b8
#7 ...

#5 [a 1/1] RUN for i in $(seq 1 10); do sleep 1; echo "a$i"; done
#5 10.14 a8
#5 11.14 a9
#5 12.14 a10
#5 ...

#7 [b 1/1] RUN for i in $(seq 1 10); do sleep 1; echo "b$i"; done
#7 11.07 b9
#7 12.07 b10
#7 DONE 13.1s

#5 [a 1/1] RUN for i in $(seq 1 10); do sleep 1; echo "a$i"; done
#5 DONE 13.9s

#6 [stage-3 1/2] COPY --from=a . ./
#6 DONE 1.2s

#8 [stage-3 2/2] COPY --from=b . ./
#8 DONE 2.4s

#9 exporting to image
#9 exporting layers
#9 exporting layers 1.4s done
#9 writing image sha256:1a7aca4ff2346858fe26dbc7fac58687c369622baf2ab8ad849ac8bcbe47a9b1 0.0s done
#9 DONE 1.4s


So in short, --progress=tty only shows the last 6 lines from each build and clears the logs after the build is done and --progress=plain can't handle parallel builds properly and is overall actually quite ugly. In my opinion, we really need a --progress=verbose option, which would act exactly like --progress=tty, but the logs will not be limited to 6 lines and they will not get erased after the build step is complete.

I might be wrong, but I can't imagine that this behavior would be hard to implement, since you just need to turn off the log "truncation" at 7 lines and after the build step finishes.

tonistiigi commented 4 years ago

I might be wrong, but I can't imagine that this behavior would be hard to implement,

This actually seems impossible to implement. There are two modes of writing the progress: to the terminal and to the text stream. If we write to text stream then there is no way to inject anything in between the already written rows. So the only way is to block or to do (kind-of) smart switching like we do today. Otoh on tty mode we only have the screen coordinates to work with, we can replace the lines in current screen but we can't write anything outside of it or have content flowing.

I think the closest that you are asking and that would be useful would be to just keep the current tty mode but once the build is over and it reprints the summary, print all the logs as well and not hide them.

tonistiigi commented 4 years ago

Also, some tips are to pipe the plain progress to sort if you don't care about it blocking. What people have asked before is to also make the vertex switching configurable so you can set ENV that define how much you want the stream to block or switch.

RuRo commented 4 years ago

@tonistiigi I am not sure, that your assertion about how the tty mode currently works is true (at least it appears to work differently to me). For example, consider that when the build process starts, the terminal is not cleared and the only line that appears below the shell prompt is

[+] Building 0.0s (0/2)

Then, as the build progresses, lines are constantly added after this line. Meanwhile this line is constantly updated to keep track of the time, ending with

[+] Building 19.5s (9/9) FINISHED

The same is true about the "scrolling" logs. At the beginning, the logs are only 1 line high and are then growing until they reach 6 lines. So clearly, the size of the "virtual" terminal is constantly changing throughout the build. I see no theoretical reason, why the logs couldn't just continue growing past the current 6 lines.

Unfortunately, your suggestion about sorting the output of --progress=plain doesn't quite work. 1) Piping to sort or even sort -n/-h doesn't work because the # sign before the number forces alphabetical sorting, which puts #10 between #1 and #2. 2) All the empty lines get sorted to the top. 3) The lines inside each numbered block also get sorted. 4) The blocks are not contiguous and are instead broken up by #123 ...

Theoretically, I could write some program, that would properly parse the plain output and format it in the way I described in my original post, but it isn't really a feasible solution IMO as that would require me to ship a parser/printer program everywhere I want to build docker images, which is kind of ridiculous to be honest.

I really think, that having a workable human-readable output format is really the job of buildkit. If buildkit wants to eventually become the default docker build implementation, you will at least need to have a comparable quality of output and parallel builds are not a good enough excuse IMO.

tonistiigi commented 4 years ago

are then growing until they reach 6 lines

The 6 limit is of course arbitrary. I'm just saying that it can't be ever-growing. The way it keeps the 6 line limit is monitoring your current screen size, finding out if there is space for the logs, finding the lines where to write the logs are on your screen, clearing these lines, and then rewriting with new content.

@tiborvass I think I saw you using sort, did you set some special flags. Obviously this is just a workaround and not perfect.

RuRo commented 4 years ago

Ah. After some experimenting with the current --progress=tty I now understand what you mean. The current tty output mode is indeed dependent on the tty size. For example, if there are more lines in the output than the height of your terminal, then the output doesn't scroll, but instead gets truncated.

Dockerfile with really long output ```Dockerfile FROM alpine:latest as base FROM base as a00 RUN touch a00 FROM a00 as a01 RUN touch a01 FROM a01 as a02 RUN touch a02 FROM a02 as a03 RUN touch a03 FROM a03 as a04 RUN touch a04 FROM a04 as a05 RUN touch a05 FROM a05 as a06 RUN touch a06 FROM a06 as a07 RUN touch a07 FROM a07 as a08 RUN touch a08 FROM a08 as a09 RUN touch a09 FROM a09 as a10 RUN touch a10 FROM a10 as a11 RUN touch a11 FROM a11 as a12 RUN touch a12 FROM a12 as a13 RUN touch a13 FROM a13 as a14 RUN touch a14 FROM a14 as a15 RUN touch a15 FROM a15 as b00 RUN touch b00 FROM b00 as b01 RUN touch b01 FROM b01 as b02 RUN touch b02 FROM b02 as b03 RUN touch b03 FROM b03 as b04 RUN touch b04 FROM b04 as b05 RUN touch b05 FROM b05 as b06 RUN touch b06 FROM b06 as b07 RUN touch b07 FROM b07 as b08 RUN touch b08 FROM b08 as b09 RUN touch b09 FROM b09 as b10 RUN touch b10 FROM b10 as b11 RUN touch b11 FROM b11 as b12 RUN touch b12 FROM b12 as b13 RUN touch b13 FROM b13 as b14 RUN touch b14 FROM b14 as b15 RUN touch b15 FROM b15 as c00 RUN touch c00 FROM c00 as c01 RUN touch c01 FROM c01 as c02 RUN touch c02 FROM c02 as c03 RUN touch c03 FROM c03 as c04 RUN touch c04 FROM c04 as c05 RUN touch c05 FROM c05 as c06 RUN touch c06 FROM c06 as c07 RUN touch c07 FROM c07 as c08 RUN touch c08 FROM c08 as c09 RUN touch c09 FROM c09 as c10 RUN touch c10 FROM c10 as c11 RUN touch c11 FROM c11 as c12 RUN touch c12 FROM c12 as c13 RUN touch c13 FROM c13 as c14 RUN touch c14 FROM c14 as c15 RUN touch c15 FROM c15 as d00 RUN touch d00 FROM d00 as d01 RUN touch d01 FROM d01 as d02 RUN touch d02 FROM d02 as d03 RUN touch d03 FROM d03 as d04 RUN touch d04 FROM d04 as d05 RUN touch d05 FROM d05 as d06 RUN touch d06 FROM d06 as d07 RUN touch d07 FROM d07 as d08 RUN touch d08 FROM d08 as d09 RUN touch d09 FROM d09 as d10 RUN touch d10 FROM d10 as d11 RUN touch d11 FROM d11 as d12 RUN touch d12 FROM d12 as d13 RUN touch d13 FROM d13 as d14 RUN touch d14 FROM d14 as final RUN touch d15 ```

Damn, what a bummer. I am once again confronted by the fact that sh/bash/term are such a huge load of horrible legacy nightmares. And this limitation really seems to be quite deep in the animal brain of VT100. 😩

Hmm, okay. Can we think of some workaround/compromise involving partial "blocking/buffering" with --progress=plain during parallel stages? The current breaking of blocks with ... is really unusable IMO.

I guess, I am thinking of something like the current output format of docker build, but in the event, that there are parallel build steps, the logs are printed in their order of appearance in the Dockerfile.

Maybe something along the lines of this (Assuming same `Dockerfile` as in original post) ``` Sending build context to Docker daemon 2.048kB ---> DONE 4.4s Step [base 1/1] : FROM docker.io/library/alpine:latest ---> e7d92cdc71fe CACHED Step [a 1/1] : RUN for i in $(seq 1 10); do sleep 1; echo "a$i"; done // Step [b 1/1] is also concurrently running at this point a1 // But its output is getting buffered and will only a2 // be displayed after Step [a 1/1] completes a3 a4 a5 a6 a7 a8 a9 a10 ---> cc80f0af4e2e DONE 14.0s Step [b 1/1] : RUN for i in $(seq 1 10); do sleep 1; echo "b$i"; done b1 b2 b3 b4 b5 b6 b7 b8 b9 b10 ---> d82096840c4c DONE 15.3s Step [stage-3 1/2] : COPY --from=a . ./ ---> 00b49df6d0f4 Step [stage-3 2/2] : COPY --from=b . ./ ---> a003b71c5ad6 Successfully built a003b71c5ad6 ```
bss-dmitry-shmakov commented 8 months ago

I wonder if anyone was able to solve the mixed up stdout that github actions log streams when buildkit runs with --progress=plain ? it builds stages of dockerfile in parallel and writes logs in a mix as well, so it's impossible to understand what is what. is there a solution for that? (since i see this thread is still open, maybe a WIP?)