tonistiigi opened this issue 2 years ago
I've recently been working on an interactive debugger for Dockerfiles, so I'm willing to work on this feature.
About breakpoints:
Now this DefinitionOp can be mutated to build up to the breakpoint and then evaluated and run interactively.
But doesn't this end up evaluating vertexes from the beginning until a breakpoint repeatedly if the user sets multiple breakpoints? An alternative would be to add support for "wrapping" a worker, as done by buildg. It implements a worker wrapper that supports breakpoints directly: https://github.com/ktock/buildg/blob/d2d03f80dbcf269626a93f11b4557a42bebcfacf/debug.go#L178-L322
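For illustration, here is a minimal, hypothetical Go model of that wrapping approach. This is not the real BuildKit `Op` interface, and `vertexOp`/`breakpointOp` are invented names: the point is only that the wrapper checks a vertex's source line against the breakpoint set before delegating execution.

```go
package main

import "fmt"

// Op models a single build vertex. The real buildkit solver Op interface
// is much richer; this is a hypothetical reduction for illustration.
type Op interface {
	Exec() (string, error)
}

type vertexOp struct {
	name string
	line int // source line in the Dockerfile
}

func (v vertexOp) Exec() (string, error) { return "ran " + v.name, nil }

// breakpointOp wraps an Op and pauses before executing it if its source
// line is in the breakpoint set, similar in spirit to buildg's wrapper.
type breakpointOp struct {
	inner       vertexOp
	breakpoints map[int]bool
	onBreak     func(line int) // stand-in for dropping into the monitor
}

func (b breakpointOp) Exec() (string, error) {
	if b.breakpoints[b.inner.line] {
		b.onBreak(b.inner.line)
	}
	return b.inner.Exec()
}

func main() {
	var hits []int
	bps := map[int]bool{10: true}
	ops := []Op{
		breakpointOp{vertexOp{"step1", 5}, bps, func(l int) { hits = append(hits, l) }},
		breakpointOp{vertexOp{"step2", 10}, bps, func(l int) { hits = append(hits, l) }},
	}
	for _, op := range ops {
		out, _ := op.Exec()
		fmt.Println(out)
	}
	fmt.Println("breakpoints hit:", hits)
}
```

The appeal of this approach is that vertexes keep executing in order and only pause at a breakpoint, instead of re-evaluating from the start for each one.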
> But doesn't this end up evaluating vertexes from the beginning until a breakpoint repeatedly if the user sets multiple breakpoints?
Ah, I didn't notice that you are even wrapping the `Op` interface. In practice, sending the previous LLB with extra ops does not affect performance. The previous LLB has already been solved and is inside the active solve graph, so when new LLB with the same digest appears it is directly matched to the existing nodes. This is how `DefinitionOp` and parallel targets in bake work as well, for example.
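The digest-based matching described above can be modeled with a tiny sketch. `solveGraph` is an invented stand-in for the real scheduler: resubmitting a definition that extends an already-solved one only adds the new vertexes, because existing digests match existing nodes.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// solveGraph models how the scheduler deduplicates vertexes by digest.
// Hypothetical reduction of the real buildkit solver.
type solveGraph struct {
	nodes map[string]bool // digest -> already in the graph
}

// add returns true if the vertex was new, false if it matched an existing node.
func (g *solveGraph) add(def string) bool {
	d := fmt.Sprintf("%x", sha256.Sum256([]byte(def)))
	if g.nodes[d] {
		return false
	}
	g.nodes[d] = true
	return true
}

func main() {
	g := &solveGraph{nodes: map[string]bool{}}
	first := []string{"FROM alpine", "RUN build"}                      // first solve
	second := []string{"FROM alpine", "RUN build", "RUN debug-shell"}  // same LLB plus one extra op

	added := 0
	for _, v := range first {
		if g.add(v) {
			added++
		}
	}
	fmt.Println("first solve added:", added)

	added = 0
	for _, v := range second {
		if g.add(v) {
			added++
		}
	}
	fmt.Println("second solve added:", added) // only the new op
}
```

This is why sending the extended LLB again is cheap: the already-solved prefix is matched by digest and not re-executed.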
@ktock a massive thanks for all the hard work you've put in so far - I think we're probably close to being able to close this issue :tada: :tada:
Hopefully we'll get some users to try this out and give feedback in the upcoming v0.11 release.
All flag names are up for later discussion. Naming is hard.
I've been trying to think about how we might work on this. The current naming is a bit tricky for new users, and I think we need to guard against adding too many new args to normal commands. To be more specific:
- `--invoke` is a bit of an internal implementation detail. I think we want to demonstrate that this is a "debugging" feature, and a name like `--invoke` doesn't indicate to a user that that's what it does or how it works. I think using invoke internally makes sense, but it makes less sense to expose it to the end user.
- `--root`, `--detach` and `--server-config` are not options that I expect most users to actually use regularly, so we should work to hide these from the user as much as possible. We could have `--detach` inside the remote server config itself, for example. I'm less sure about `--root` and `--server-config`; maybe we could put these at the root of the CLI, or even expose them as environment variables?

My first idea was that we could try to rename everything to be debug-themed. So, in my head, that would mean that:
- `--invoke` becomes `--debug`
- `debug-shell` becomes just `debug`
Unfortunately for us... `--debug` is already claimed as a top-level docker CLI option, and does something entirely different. So we'd be in the weird situation of an option behaving differently depending on where it appears in the user's command, which isn't great.
We could change `debug` to something like `dev` - I guess it's kind of like a "development mode" for Dockerfiles, so that could work.
@crazy-max suggested to me an idea I like a lot better: just have everything debug-related under a single, top-level `debug` command.
The details (as I imagine them):
- `buildx debug-shell` would just become `buildx debug shell`, or even just `buildx debug`.
- `buildx build` would remain unchanged from the current user experience. We wouldn't have any debugging flags here; it's all about build (though we should still use the controller API).
- `buildx debug build` would allow all the same options that are currently in `build` to be specified after the `build` component. However, we could add generic debugging flags before the `build` component, such as `--on`, e.g.:
```
$ buildx debug --on=error build . --target <target-in-dockerfile>
```
That way, we split out the debugging flags from the build flags, and also allow integrating that more neatly with bake in the future:
```
$ buildx debug --on=error bake <target>
```
- `buildx debug` would allow "extra" args after a `--` separator, which would be run directly in the monitor before stdin is connected. For example, to break on line 10 and drop into a shell:
```
$ buildx debug build . --target <target-in-dockerfile> -- break 10\; exec sh
```
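Splitting those post-`--` arguments into monitor commands could look roughly like this. A sketch only: `splitMonitorArgs` is a hypothetical helper name, and the exact quoting rules are undecided; it assumes the shell delivers the argv tail `["break", "10;", "exec", "sh"]` and that `;` terminates a command.

```go
package main

import (
	"fmt"
	"strings"
)

// splitMonitorArgs turns the argv tail after "--" into monitor commands,
// splitting on trailing ";" boundaries, so that
//   buildx debug build . -- break 10\; exec sh
// yields ["break 10", "exec sh"]. Hypothetical helper.
func splitMonitorArgs(args []string) []string {
	var cmds []string
	var cur []string
	for _, a := range args {
		if strings.HasSuffix(a, ";") {
			cur = append(cur, strings.TrimSuffix(a, ";"))
			cmds = append(cmds, strings.Join(cur, " "))
			cur = nil
			continue
		}
		cur = append(cur, a)
	}
	if len(cur) > 0 {
		cmds = append(cmds, strings.Join(cur, " "))
	}
	return cmds
}

func main() {
	fmt.Println(splitMonitorArgs([]string{"break", "10;", "exec", "sh"}))
}
```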
I actually really like the flow of this: it reads easily, and it's not hard to modify a command you already run so that it gets debugged. Also, all of the debugging is in one place and doesn't need to be spread across multiple commands (even if, in the code, that might not be 100% true).
I'm curious what people think about the above ideas, or if anyone has alternative ideas we should consider - I think we should work out what we want to do before the next buildx release (and hopefully implement it!), so we can start to get some feedback from users.
Posted by @tonistiigi at https://github.com/docker/buildx/pull/2006#pullrequestreview-1672830646
Follow-up: `--on=error` on a container error gets me into the container, but there is no context of what happened, what the last command was, etc. I think the help command or monitor messages should give context about what the current interactive context is (build result for a specific target, error result from a command) so there is context for what gets run on `exec`/`reload`/`rollback`.

Follow-up: we need an `ls` command that would list the files in the current directory. If I get an error like

```
runc run failed: unable to start container process: exec: "sh": executable file not found in $PATH
```

then I have no idea what I'm missing. This could also be an exec of a debug image that has the current mounts mounted somewhere.
Posted by @tonistiigi at https://github.com/docker/buildx/pull/2006#pullrequestreview-1681377351
- Let's say I have two different types of errors. One is a wrong Dockerfile command, and the other is a process error.
```
Dockerfile:60
--------------------
  59 |     ARG TARGETPLATFORM
  60 | >>> RUN2 --mount=type=bind,target=. \
  61 | >>>     --mount=type=cache,target=/root/.cache \
  62 | >>>     --mount=type=cache,target=/go/pkg/mod \
  63 | >>>     --mount=type=bind,from=buildx-version,source=/buildx-version,target=/buildx-version <<EOT
  64 |     set -e
--------------------
ERROR: dockerfile parse error on line 60: unknown instruction: RUN2 (did you mean RUN?)
[+] Building 0.0s (0/0) docker:desktop-linux
Launching interactive container. Press Ctrl-a-c to switch to monitor console
Interactive container was restarted with process "o5e8x1ty9nn2j93m62b8zmhdn". Press Ctrl-a-c to switch to the new container
Switched IO
```

```
------
Dockerfile:60
--------------------
  59 |     ARG TARGETPLATFORM
  60 | >>> RUN --mount=type=bind,target=. \
  61 | >>>     --mount=type=cache,target=/root/.cache \
  62 | >>>     --mount=type=cache,target=/go/pkg/mod \
  63 | >>>     --mount=type=bind,from=buildx-version,source=/buildx-version,target=/buildx-version <<EOT
  64 | >>>   set -e
  65 | >>>   xx-go2 --wrap
  66 | >>>   DESTDIR=/usr/bin VERSION=$(cat /buildx-version/version) REVISION=$(cat /buildx-version/revision) GO_EXTRA_LDFLAGS="-s -w" ./hack/build
  67 | >>>   xx-verify --static /usr/bin/docker-buildx
  68 | >>> EOT
  69 |
--------------------
ERROR: process "/bin/sh -c set -e\n xx-go2 --wrap\n DESTDIR=/usr/bin VERSION=$(cat /buildx-version/version) REVISION=$(cat /buildx-version/revision) GO_EXTRA_LDFLAGS=\"-s -w\" ./hack/build\n xx-verify --static /usr/bin/docker-buildx\n" did not complete successfully: exit code: 127
[+] Building 0.0s (0/0) docker:desktop-linux
Launching interactive container. Press Ctrl-a-c to switch to monitor console
Interactive container was restarted with process "l3x31fm3i08owokoxwtazeuuw". Press Ctrl-a-c to switch to the new container
/ #
```
As expected, only the second one is debuggable (only the second one opens a shell as well). But from the output they print the same messages about interactive containers and switching IO. It should be clearer that these are different types of errors: why the first one does not create an execution context, and what runs in the shell of the second one.
This is a follow-up to BuildKit debugging issue https://github.com/moby/buildkit/issues/1472 so we can discuss the UX and next steps for features like https://github.com/moby/buildkit/pull/2813 . https://github.com/moby/buildkit/issues/1472 is mostly implemented in BuildKit v0.9 with additional signaling patches in v0.10 so unless we missed something no more (breaking) BuildKit changes should be needed.
All of this work does not need to end up in the buildx repository, and some may be moved out later. I don't want opinionated dev features in buildctl, which should be a vendor-agnostic test tool, and I also don't want to maintain two similar but different debugging stacks. Aside from that, code reuse is encouraged.
The BuildKit issue concentrated on the internal building blocks for this feature, and you should read it first. Here, I'm proposing steps for incremental PRs that end up with a user-friendly debugging feature.
All flag names are up for later discussion. Naming is hard.
PRs may be combined where it makes sense for review, but all the described steps should be quite independent.
PR1
Add the possibility to interactively run a process using the `NewContainer` API after the build has completed. The build will run with the progress bar until completion; the progress bar will finish and a container will be launched. The container redirects all stdio and signals. TTY is enabled if the user enabled TTY for the main process.
PR2
Add a "monitor mode" to the interactive process. When running an interactive process, the user can switch between the process IO and the monitor IO, similar to the QEMU monitor mode. In monitor mode they can issue additional commands.
In the very first PR, the only supported command may be "exit".
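The switching itself can be modeled as a small byte-stream filter that watches for an escape sequence (Ctrl-a c, the prefix convention used by tools like screen and QEMU). A hypothetical reduction of the real stream handling; `ioSwitcher` is an invented name.

```go
package main

import "fmt"

// ioSwitcher scans the user's keystrokes for the Ctrl-a c escape
// sequence and toggles between forwarding bytes to the container
// process and to the monitor. Hypothetical sketch.
type ioSwitcher struct {
	monitor  bool // current destination: false = container, true = monitor
	sawCtrlA bool
}

const ctrlA = 0x01

// feed consumes one input byte. It returns the byte to forward
// (-1 if the byte was swallowed by the escape sequence) and whether
// we are now in monitor mode.
func (s *ioSwitcher) feed(b byte) (forward int, inMonitor bool) {
	switch {
	case s.sawCtrlA && b == 'c':
		s.sawCtrlA = false
		s.monitor = !s.monitor
		return -1, s.monitor
	case s.sawCtrlA:
		s.sawCtrlA = false
		return int(b), s.monitor
	case b == ctrlA:
		s.sawCtrlA = true
		return -1, s.monitor
	default:
		return int(b), s.monitor
	}
}

func main() {
	s := &ioSwitcher{}
	for _, b := range []byte{'l', 's', ctrlA, 'c', 'e', 'x', 'i', 't'} {
		if fwd, mon := s.feed(b); fwd >= 0 {
			dest := "container"
			if mon {
				dest = "monitor"
			}
			fmt.Printf("%c -> %s\n", fwd, dest)
		}
	}
}
```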
PR3
Add a "rollback" command to monitor mode. With this command the user can make modifications in the interactive container, and when they issue the "rollback" command they are brought back to the initial state.
Add a "reload" command to monitor mode. This will run the build again (now with possibly updated sources) and invoke the shell again.
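The rollback semantics can be sketched as snapshot-and-restore. This is a toy model: the real feature restores container filesystem state via BuildKit references, not an in-memory map, and `debugSession` is an invented name.

```go
package main

import "fmt"

// debugSession models rollback: the state right after the invoked step
// is snapshotted, the user mutates a working copy in the shell, and
// "rollback" restores the snapshot. Hypothetical reduction.
type debugSession struct {
	initial map[string]string // immutable snapshot
	current map[string]string // what the interactive container sees
}

func newDebugSession(files map[string]string) *debugSession {
	s := &debugSession{initial: files}
	s.rollback()
	return s
}

// rollback discards all mutations and restores the initial snapshot.
func (s *debugSession) rollback() {
	s.current = make(map[string]string, len(s.initial))
	for k, v := range s.initial {
		s.current[k] = v
	}
}

func main() {
	s := newDebugSession(map[string]string{"/app/main.go": "v1"})
	s.current["/tmp/scratch"] = "junk" // user experiments in the shell
	s.rollback()                       // back to the initial state
	fmt.Println(len(s.current), s.current["/app/main.go"])
}
```

"reload" differs from "rollback" in that it re-runs the build (refreshing the snapshot from possibly updated sources) before re-entering the shell, instead of merely restoring the old snapshot.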
PR4
Refactor the build command to invoke builds in a background process. This is important, as we don't want the lifecycle of a "debugging session" to be locked to a single process. A socket should be created under `~/.buildx`, and even if the current process (unexpectedly) dies, its state can be accessed again via the socket.

PR5
Add `list` and `attach` commands to the monitor mode. List would show all the currently active sessions (via the socket described in the previous section). Attach would make that session active in the current process. If the session is already active in another process, that process would detach and go to the monitor mode.

PR6
Add an `exec` command to execute new processes in the same debug container. All processes are attachable as described in the previous section.

PR7
`docker buildx build --invoke=debug-shell` should go directly to monitor mode, where processes can be created with `exec`. We can also have `docker buildx debug-shell` to start monitor mode without a specific container context.

PR8
Add `docker buildx build --invoke=on-error`. In this mode, if the build ends with an error, a debug shell will be opened at the error location. The error returned by BuildKit is typed and contains references to the state at the error and the state at the beginning of the failed step. Monitor commands allow switching between these states. The error also includes a source map that can be shown to the user.

Next:
In the next steps we can write more specific proposals for:
Breakpoint debugger
There are two ways to approach this. As the builder is lazy, we can call `Solve`, which will return a result without actually evaluating it. This result can be converted to a `DefinitionOp`, which contains the source locations. Now this `DefinitionOp` can be mutated to build up to the breakpoint and then evaluated and run interactively. I think this is similar to the wrapper in buildg, without requiring a BuildKit update or proxy. The problem with this approach is the case where a frontend does multiple `Solve` calls (maybe in parallel); this would start to conflict with the debugger logic. Therefore I think a better approach could be to define this as a frontend capability and send breakpoint info with the build opts to the frontend. If the frontend allows debugging, it would stop on the breakpoint and return the result that the debugger will then show.

Monitor mode
Add more functions to monitor mode, e.g. commands to inspect file layouts and transfer files. Keep shell history between invocations.
Buildx bake
Enable interactive sessions in `buildx bake`. E.g. `docker buildx bake dev` could build all the images that are part of the project, run them, and put the user into a dev container. Inside the dev container they can switch between active processes, etc.

@ktock @crazy-max