Closed: mumoshu closed this 6 years ago
Just wrote a lengthy issue at #138 trying to explain why I (and possibly you) need this. @rhs Would you mind reviewing this? Thanks!
Hi,
First off, I think the use case makes a lot of sense. Thanks for taking the time to describe it in such detail, and thanks for contributing a PR. I have a couple general comments/question I'll put here and then follow up with any more specific comments in the code.
I'm wondering if there are any other approaches to solving this problem you've considered? Maybe using a multi stage docker build? I ask for a couple of reasons:
1) I'm curious as I expect we will need to solve this also. ;-)
2) A big goal of forge is to work consistently across as many different environments as possible. For this reason I'm generally cautious about adding external dependencies in a way that can make forge behavior brittle/environmentally dependent. Right now the primary external dependencies forge has are kubectl and docker, which are both relatively stable/known quantities, so I have to ask the question of whether or not the use case can be reasonably accommodated without expanding the external dependency footprint. Alternatively, if it can't be, then I'd like to understand what imagebuilder stability is like, e.g. can we reasonably expect everyone to have a certain minimum version? Should we have a version check in place so if there are incompatibilities we fail fast? Is it still an early project in rapid development? (FYI, I'm not personally familiar with imagebuilder, so I'd appreciate any context you could give on these questions.)
My only other big comment is on tests. FWIW, I'm aware the tests don't run properly if you aren't in the datawire organization. I'm working on improving that, hopefully soon. I'd like it to be a) easy to run the full test suite and b) easy to contribute tests. I need to finish vetting the credentials we use from CI and write a bit more documentation to make this a reality, but for now if you could supply a sample project (could be a tarball/zip/etc, or a pointer to a git repo) and a simple transcript of usage along with expected output, I can turn that into a test quite easily.
@rhs Hi, thanks for reviewing!
I'm wondering if there are any other approaches to solving this problem you've considered?
I've updated the lengthy description at #138.
The only other approach in my mind is the one I've mentioned in #138 as "4. Unreadable, relatively complicated script to form a docker-image-build pipeline". It is occasionally implemented w/ lengthy bash scripts and/or Makefiles chaining docker-runs and docker-builds to finally produce the docker image for your app.
Maybe using a multi stage docker build?
I've tried it and then abandoned it for my business apps (note that it is still useful for easily building efficient docker images for OSS).
Please see "Why not docker multi-stage builds?" in #138 for more info.
In a nutshell, it requires a trade-off between slow builds and leaking secrets into image layers/metadata. IMHO this is an unnecessary trade-off, hence this PR.
@rhs
Some more alternatives I remember:

docker-ssh-exec: It doesn't work everywhere though. As it requires you to have access to the host running the docker daemon that runs your docker-build, it won't work w/ CircleCI 2.0 or Travis CI's container executor, for example. It would work w/ the respective machine executors, which run your CI build on a dedicated VM, but that delays startup of the CI job in the first place. Again, IMHO this is an unnecessary trade-off. Just use client-side docker image building.

Running Vault while fetching secrets from within docker-build contexts: It would be an option if you run everything from CI/CD to apps on your own VMs. In case you use a SaaS like CircleCI for CI, it implies that you have to directly or indirectly (via VPN, for example) expose Vault to the Internet so that your docker-build contexts running on CircleCI's infra can access the Vault. This is a big no-no for me. The smaller the attack surface, the better the security.

Also note that both alternatives require you to install a command into the docker image in order to obtain secrets from the docker-ssh-exec server or Vault. It slightly complicates the Dockerfile. And again, this is an unnecessary complication IMHO. For example, with imagebuilder you can just mount a volume w/ secrets into a docker-build context, which leaves nothing in the resulting image.
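As an illustration of that volume-mount approach, here is a small sketch of the kind of invocation I mean. The `-mount src:dst` flag syntax is my reading of openshift/imagebuilder's README, so verify it against your version; the helper name and paths are hypothetical.

```python
# Sketch: building with imagebuilder while mounting secrets from the host,
# so credentials never become part of an image layer or its metadata.
# The -mount flag syntax is an assumption; check `imagebuilder --help`.
def imagebuilder_argv(image, context_dir, mounts):
    """Build the argv for an imagebuilder run with host-side secret mounts."""
    argv = ["imagebuilder"]
    for src, dst in mounts:
        argv += ["-mount", "%s:%s" % (src, dst)]
    argv += ["-t", image, context_dir]
    return argv

argv = imagebuilder_argv("myapp:dev", ".", [("/home/me/.ssh", "/root/.ssh")])
print(" ".join(argv))
# To actually run it (requires imagebuilder installed):
# import subprocess; subprocess.check_call(argv)
```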
@rhs
A big goal of forge is to work consistently across as many different environments as possible. For this reason I'm generally cautious about adding external dependencies in a way that can make forge behavior brittle/environmentally dependent.
I definitely agree and understand that.
Right now the primary external dependencies forge has are kubectl and docker, which are both relatively stable/known quantities, so I have to ask the question of whether or not the use case can be reasonably accommodated without expanding the external dependency footprint.
Ack. I'm ok with just introducing something like:

```yaml
containers:
- image:
    builder:
      command: your-preferred-builder
```

so that forge calls `your-preferred-builder -f $dockerfile -t $img $build_args $dir` instead of the default `docker build`. The user places a docker-build-compatible script/command to instruct forge to run it instead of docker-build. How about that?
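To make the proposed contract concrete, the command-line construction could be sketched like this. This is a hypothetical illustration, not forge's actual code; the function name and argument handling are assumptions.

```python
# Sketch of the proposed contract: forge shells out to any
# docker-build-compatible builder command. Hypothetical helper, not forge's API.
def builder_argv(builder, dockerfile, image, build_args, context_dir):
    """Build the argv: <builder> -f <dockerfile> -t <img> <build_args> <dir>."""
    argv = [builder, "-f", dockerfile, "-t", image]
    for name, value in sorted(build_args.items()):
        argv += ["--build-arg", "%s=%s" % (name, value)]
    argv.append(context_dir)
    return argv

print(builder_argv("your-preferred-builder", "Dockerfile", "myapp:dev",
                   {"VERSION": "1.2.3"}, "."))
```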
Alternatively, if it can't be, then I'd like to understand what imagebuilder stability is like, e.g. can we reasonably expect everyone to have a certain minimum version? Should we have a version check in place so if there are incompatibilities we fail fast?
I wish we had one but no - it doesn't even have a semantic version yet.
Is it still an early project in rapid development? (FYI, I'm not personally familiar with imagebuilder, so I'd appreciate any context you could give on these questions.)
Honestly I'm also new to the party, but AFAICS it originated from `oc`'s code-base (the CLI for OpenShift Origin) and has been extracted and maintained in a dedicated repo.
if you could supply a sample project (could be tarball/zip/etc, or a pointer to a git repo) and a simple transcript of usage along with expected output, I can turn that into a test quite easily.
You can just add the following snippet to the service.yaml of your hello-world-python example:

```yaml
#snip
containers:
- image_builder: imagebuilder
#snip
```
I've implemented the alternative in https://github.com/datawire/forge/pull/137#discussion_r163450263
A config now looks like:

```yaml
containers:
- image:
    builder: imagebuilder
```
If you still don't want imagebuilder as forge's dependency, I can also remove it and try this:

```yaml
containers:
- image:
    builder:
      command: your-preferred-builder
```

wdyt?
Thanks for detailing the alternatives, that's super useful info to have!
I'm fine adding the imagebuilder dependency, I'm in the process of pulling it into the dev container now. If it presents a stability problem we can cross that bridge when we come to it. I'll try to get this into a release by tomorrow or maybe early next week if I hit any issues.
@rhs Thanks again for taking your time!
Although I have not yet managed to run the whole test suite because it seems to require a running kubernaut env, I've added some unit tests to `test_service.py` and `test_schema.py` while fixing a regression.
Hope you like it.
FWIW, `test_schema.py` and `test_service.py` are passing on my machine.
I added an end-to-end test with imagebuilder and merged this into master, so it should go out in the next release. Please note that I tweaked the schema a little bit. I was worried that having `image:` and `name:` would be confusing, and I wasn't sure what the point of having the extra object was, so I pulled `builder:` up to the top level. If I missed something here and there is a good reason for the extra object, then please let me know.
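Concretely (sketching the schema change just described; exact shape unverified against forge's docs), the earlier snippet should now collapse to something like:

```yaml
containers:
- builder: imagebuilder
```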
@rhs Hi, thanks for your efforts! Regarding the point, please let me clarify a bit. Not arguing, but thought it could be good for us to sync up.
Probably we have slightly different concepts in our minds while referring to the same terms.
For me, `containers[].name` sounds like it's solely the container's name, irrelevant to the container image name. I'd even expect that we could customize an image name regardless of the container name, e.g. via `containers[].image.name`.
However, as I understand forge's convention of naming the container image the same as the container (which is a different concept than a container image), it is understandable.
From my perspective though, `containers[].builder` is misleading. I'm not building a "container", but a "container image". That's why `containers[].image.builder` or `containers[].image_builder` sounded more natural to me.
imagebuilder is an alternative to `docker build` for building secure docker images (a volume mount inside a build context means there's no need to leave secrets inside the docker image/metadata in case you have `bundle install` or `npm install` or etc. to install packages hosted in private Git repos) and small docker images (w/ squashing) in a faster way (no context upload). This adds imagebuilder support to forge, so that it can leverage imagebuilder for secure, fast, smaller image builds.

Spec

A new setting key `image_builder` is added under `containers` in service.yaml, which can be omitted or set to either `docker` or `imagebuilder`.

If the value was omitted or set to `docker`, `docker build` is used for building docker images. This is the existing and default behavior before/after this change.

If the value was set to `imagebuilder`, `imagebuilder` is used instead.

Notes

The only gotcha is that to support `--build-arg`s for imagebuilder, you need https://github.com/openshift/imagebuilder/pull/58 until it gets merged.
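The spec above boils down to a small dispatch on the `image_builder` value; a hypothetical sketch (not forge's actual implementation, names are illustrative):

```python
# Hypothetical sketch of the image_builder setting described in the Spec.
# Not forge's actual code; function and error text are illustrative.
def base_build_command(image_builder=None):
    """Map the service.yaml `image_builder` value to a builder command."""
    if image_builder in (None, "docker"):
        # Omitted or "docker": keep the existing default behavior.
        return ["docker", "build"]
    if image_builder == "imagebuilder":
        return ["imagebuilder"]
    raise ValueError("unknown image_builder: %r" % (image_builder,))

print(base_build_command())                # ['docker', 'build']
print(base_build_command("imagebuilder"))  # ['imagebuilder']
```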