Closed ckyrouac closed 4 months ago
@cgwalters This seems to behave the same as using os.Stdin on my machine. I don't see any progress bar either way though, so maybe I'm doing something wrong.
When using os.Stdin, the initial ssh connection is busted until sending a newline. This is the error:
ERRO[0110] Failed to write input to service: write unix @->/home/chris/.local/share/containers/podman/machine/qemu/podman.soc
k: write: broken pipe
Argh, sorry for the regression here!
I don't see any progress bar either way though, so maybe I'm doing something wrong.
You need bootc built from git main with the changes from https://github.com/containers/bootc/pull/655
This is effectively reverting https://github.com/containers/podman-bootc/pull/55 right? I think nil
is equivalent to /dev/null
.
Hmmm....this makes me think that what's going wrong here is we're not correctly detaching from the container? There's a whole lot going on in podman attach.go
.
Tangentially this code is IMO another great example of the superficial simplicity of Go being paid down later...trying to use nil
for "unset value" except that comes back to bite with "Note that an interface value that holds a nil concrete value is itself non-nil. ". In Rust it'd just be Option<T>
...
@cgwalters This seems to behave the same as using os.Stdin on my machine. I don't see any progress bar either way though, so maybe I'm doing something wrong.
When using os.Stdin, the initial ssh connection is busted until sending a newline. This is the error:
ERRO[0110] Failed to write input to service: write unix @->/home/chris/.local/share/containers/podman/machine/qemu/podman.soc k: write: broken pipe
I didn't see that when I tested it, it's true that the e2e test don't run on my machine, for some reason it got killed by the OOM killer
Tangentially this code is IMO another great example of the superficial simplicity of Go being paid down later...trying to use
nil
for "unset value" except that comes back to bite with "Note that an interface value that holds a nil concrete value is itself non-nil. ". In Rust it'd just beOption<T>
...
A famous question asked to the Go team at a conference: "Why did you choose to ignore any research about type systems since the 1970s"
I briefly dug at this a bit more, and what I think is going on is again a classic Go problem¹. Basically, the Attach
API spawns a bunch of goroutines, such as this one which copies stdin.
In theory, us cancelling the context should terminate all work spawned from the Attach
API...but it doesn't, because goroutines are "stackful" and famously not easily cancellable. And so the Attach API just leaks these goroutines.
If one just invokes podman run --rm -ti someimage somecommand
(which is what we're replicating in custom code) the basic fact is that when the podman
process exits all its goroutines and process state are torn down reliably by the kernel, papering over this leak.
I'm pretty tempted to just switch back to forking off podman run
here. It's not clear to me that using the Go API bindings to the HTTP API is buying us a lot of value here.
That said, I think we could fix the podman code here by changing it to force close the socket when the context is cancelled. That's the usual way to cancel reads in Go. It's still fundamentally racy though without going to the extra level of having something like a "cancellation complete" channel in this API that synchronously waits for the worker goroutines to exit though.
¹ Yes, yes it's a poor craftsman who blames their tools and to be honest, asynchronous Rust also has its share of traps - and that trap is very much precisely because of how cancellation of an async task is just a simple "drop"...but, at least it's pretty straightforward to audit Rust codebases for use of select
.
https://github.com/containers/podman-bootc/pull/61 is an alternative fix
I'm closing this since #61 supersedes it
When using os.Stdin, the initial ssh connection pipe is broken.