Closed jameslamb closed 4 months ago
+1 for this, except make runOptions
a list of arguments passed directly to docker run
, .e.g runOptions: [--network host --gpus all]
. This provides maximum flexibility and maps directly to the CLI.
This seems like a good goal to me.
One thing I want to avoid is leaking the implementation details around canary
using docker
under the hood. I tried to build things in a way where we could support podman
or nerdctl
(or whatever comes next) in the future. So I don't want to add anything docker
specific to the configuration. In this case I would lean towards option 1 because it both copies the Kubernetes API and also allows it to be implemented under the hood for each runtime.
However, I've seen other projects like kind
try and maintain the same kind of disconnect between the user API and the implementation and result in things being hard or impossible (like setting the --gpus=all
flag). So I don't want the goal of implementation detail purity to work against us.
I also quite like option 3 because it would just allow passing arbitrary flags to the CLI under the hood, and I like @trxcllnt's suggestion of making it a simple list. However I worry a little about a future where we want to support podman
and all of these manifests have docker
specific flags set in them.
Perhaps the way to deal with this us to just expose the implementation directly and have something like dockerRunOptions
which allows is to add podmanRunOptions
in the future. The only downside to this is that people will end up making manifests that are tailored to a certain runtime and will likely not work with a different one.
Any thoughts on a solution to this?
Perhaps the way to deal with this us to just expose the implementation directly and have something like
dockerRunOptions
which allows is to addpodmanRunOptions
in the future
I think we should pursue this dockerRunOptions
approach, and making it a list (instead of a map) is fine with me.
The idea to namespace by container runtime is a good one. There are only a small number of container runtimes I can imagine this tool ever needing to support, so the proliferation of those {runtime}RunOptions
shouldn't be too bad.
And with all of this defined in static configuration, there's a path available to use warnings / errors to move configs over to top-level (portable-across-runtimes) options as new ones are added. Similar to the the example I gave above, by raising warnings / errors like this:
"[WARN] Passing --gpus via dockerRunOptions
is not allowed. Please use top-level key gpus:
instead".
I like this the most because it allows for experimentation. In the absence of this, right now to test if --network host
even does actually fix the particular CI that inspired this issue, I'd have to modify container-canary
's source code, rebuild it, and then use that built-from-source version to test. With the ability to pass through arbitrary flags, that's no longer necessary.
That sounds totally reasonable to me. Let's go with option 3 and in the future move things that we know people want to option 1 for better portability.
I just published v0.4.0
which includes option 3.
Excellent, thank you so much!
Description
I'd like to be able to convince
container-canary
to run containers with--network host
.@trxcllnt and I have been using
container-canary
in a docker-in-docker context in CI, and he observed some networking instability in that environment that might be be resolved by avoiding double-nested network isolation.Benefits of this work
Allows finer-grained control over how
container-canary
is run.If this does seem to help for the docker-in-docker case specifically, would improve
container-canary
's usability in a common pattern used in CI.Acceptance Criteria
container-canary
pass--network host
todocker run
Approach
Options Considered
I see 3 options for exposing this configuration, @jacobtomlinson I'll happily defer to your opinion on which to pursue (assuming you're open to this idea generally).
Option 1: hostNetwork
The k8s V1 podSpec has an option
hostNetwork
, for exactly this purpose. (k8s docs).So that'd look like the
container-canary
config picking up a new boolean option:Option 2: named network argument
container-canary
could instead support passing any network name via a top-level config argument.This would allow greater customization, of the form described in "User-defined networks" (Docker docs).
Option 3: arbitrary array of extra arguments
container-canary
could support passing arbitrary flags to the container runtime via an array.Where the implication is that for each
key: value
pair there,{key} {value}
is passed todocker run
. Similar to this:https://github.com/NVIDIA/container-canary/blob/c47389618e547db3d572ddfd62e3357449880ec9/internal/container/docker.go#L41-L47
GitHub Actions has a similar interface, where there are some container characteristics that are set via top-level configuration, and then any other arbitrary flags can be passed via
jobs.<job_id>.container.options
(GitHub Actions docs)Proposal
I (@jameslamb) personally prefer option 3. It offers high flexibility to experiment, so could add a lot of capability to
container-canary
all at once (beyond just--network host
).For example, it'd immediately also add support for running tests that require GPUs (you could just pass
--gpus all)
.And having the arguments be keys in the YAML map means that if, in the future, we wanted to have
container-canary
prohibit setting some option or require that it be set by some other top-level configuration value, it'd be straightforward to issue deprecation warnings and eventually errors like "Passing--gpus
viarunOptions
is now allowed. Please use top-level keygpus:
instead".