projectatomic / atomicapp

[UNMAINTAINED] This is the reference implementation of the Nulecule container application Specification: Atomic App

Don't unpack files to /atomicapp during run #142

Closed aweiteka closed 8 years ago

aweiteka commented 9 years ago

Based on user feedback, it is unexpected that a bunch of files show up in the current working directory when running an atomic app. This includes visible files as well as the hidden .workdir/ directory. That behavior is expected during install, but not during run. I propose we use a temporary directory like /tmp for these things and only write files during install.

aweiteka commented 9 years ago

Related to #136

vpavlin commented 9 years ago

Seems like we are reverting the decision we made about .workdir some time ago :) Anyway, I think we write only the templated artifacts on run; the rest is written on install. And we cannot prepare the artifacts on install, as the user might want to change the answers.

Writing those artifacts to /tmp is problematic because /tmp lives in the container, so we would need to mount some directory there.

nzwulfin commented 9 years ago

Based on #136, I did some more digging into containerized atomicapps (the nulecule examples in particular) and came up with a "namespaced" solution using LABELs.

LABEL STOP docker run -it --rm --privileged --net=host -v `pwd`/Nulecule:/atomicapp -v /run:/run -v /:/host --name NAME -e NAME=NAME -e IMAGE=IMAGE IMAGE -v stop IMAGE
LABEL RUN  docker run -it --rm --privileged --net=host -v `pwd`/Nulecule:/atomicapp -v /run:/run -v /:/host --name NAME -e NAME=NAME -e IMAGE=IMAGE IMAGE -v run IMAGE
LABEL INSTALL docker run --rm -it --privileged -v /run:/run -v `pwd`/Nulecule:/atomicapp -e IMAGE=IMAGE -e NAME=NAME --name NAME IMAGE -v install --destination IMAGE /application-entity

This block added to helloapache will create a Nulecule/nzwulfin/helloapache structure in the current working directory. This feels better than having the Dockerfile just drop files in PWD, but it could also rely solely on atomicapp's run IMAGE functionality to create the target directory for the app.

I have to use install before run with this label set. If this is interesting to other folks, the branch I'm working on is here (https://github.com/nzwulfin/nulecule/tree/nulecule-work-dir)

nzwulfin commented 9 years ago

I've looked at a few more methods to encapsulate an app. What I've run into is that every method of encapsulation, aside from creating some sort of working directory on the host, runs into the issue @vpavlin brings up about answer files.

That is, unless someone knows a better way to inject a single file at run time into an existing container directory. Everything I tested either happens too late (docker cp) or replaces the whole volume. Having atomicapp take an option for alternate locations of answer files doesn't help, since that's in the LABEL and would need a rebuild.

It looks like either:

vpavlin commented 9 years ago

Random thought & what if...

We will need/have a REST API that will enable us to control atomicapp through this API instead of by running commands. We already know Cockpit will need it, and I think Foreman and other management solutions can benefit from it as well. So what if, once we have that API, we create a client (or add this to atomic directly?) that can load answers.conf from anywhere and pass it to the running atomicapp instance in the container through that API?

This breaks the use case we are creating here - anyone can download and run the app without having anything installed locally. But we could still preserve the current behaviour for those users, while more "serious" users would be able to get rid of these writes to the PWD.

Also, there could be a client (in Go, maybe?) bundled in the atomicapp image that would be unpacked to a well-known location when "atomic install" is run.

Does it make sense?

nzwulfin commented 9 years ago

I do like the idea of an API, but I think "nothing installed locally other than /usr/bin/atomic" is something to preserve. FYI, this is a bit of a long read.

To that end, I'm proposing something I'll leave to the team to determine whether it's brilliant or crazy :8ball: It puts some burden on the end admin to maintain a local container, and the spec for the atomicapp container has to be fairly strict.

There are two branches I've put together and 3 changes.

You can look at the work here https://github.com/nzwulfin/atomicapp/tree/onbuild https://github.com/nzwulfin/nulecule/tree/onbuild

1: Change the atomicapp container to use ONBUILD and standard locations

The atomicapp container currently uses /application-entity as the runtime location. Downstream containers add application information and artifacts in their Dockerfiles. That location is modifiable by changing where application data gets stored and overriding the LABELs for run and install. I propose we mandate that location and not allow overrides by containers built from the atomicapp image.

The ONBUILD Dockerfile directive allows for triggered actions to be taken on the creation of a child image from a parent, e.g. a Nulecule app from the atomicapp container. If we standardize the locations of application-added information, we can move the ADD directives that currently exist in the child Nulecule app Dockerfiles to the parent atomicapp Dockerfile.

ONBUILD ADD . /application-entity/

When a child app is built, docker will then run the ADD directives and deliver the local application content to the correct container locations.

Sending build context to Docker daemon 
Step 0 : FROM nzwulfin/atomicapp
# Executing 2 build triggers
Trigger 0, ADD . /application-entity/
Step 0 : ADD . /application-entity/
Trigger 1, ADD /artifacts /application-entity/artifacts
Step 0 : ADD /artifacts /application-entity/artifacts
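
For reference, a minimal sketch of how the triggers might sit in the parent atomicapp Dockerfile; the base image is a placeholder and the second trigger's source path is inferred from the build output above:

# Parent atomicapp Dockerfile (excerpt, base image assumed for illustration)
FROM fedora
# ... install atomicapp itself here ...
# Triggers that fire when a child Nulecule app image is built FROM this image
ONBUILD ADD . /application-entity/
ONBUILD ADD /artifacts /application-entity/artifacts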

2: Use bare Dockerfiles with overrides for vendor containers

The local working directory for a child Nulecule app built for distribution would look the same as the current local working directory. The Dockerfile would only have those things that need overriding (MAINTAINER, etc.). The directories would be mandated by the atomicapp container spec as /application-entity and /application-entity/artifacts. Since all of the files are added to the container, there's no on-host unpacking.
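
A minimal sketch of such a bare vendor Dockerfile under this proposal; the maintainer value is a placeholder:

# Bare child Dockerfile: the parent's ONBUILD triggers perform the ADDs
FROM nzwulfin/atomicapp
MAINTAINER Example Vendor <vendor@example.com>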

The other required artifact is a published, complete answers.conf. All available variables should be present but commented out, with a section for each of the appropriate names in the graph.
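
As an illustration, a commented-out answers.conf skeleton for something like helloapache could look as follows; the section and parameter names here are assumptions, not the actual sample file:

[general]
# provider = kubernetes
# namespace = default

[helloapache-app]
# image = centos/httpd
# hostport = 80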

3: Use answers.conf Dockerfiles for local overrides and containers

Should an admin want to create an application non-interactively, the admin would use a Dockerfile to create a local child application container. This container would ADD the answers.conf to the /application-entity directory in the container. This Dockerfile would also be extremely simple, using the vendor container as the FROM image.
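
A sketch of that local override Dockerfile, with a hypothetical vendor image name:

# Local child image that bakes the admin's answers into the app container
FROM vendor/helloapache
ADD answers.conf /application-entity/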

I've got 3 containers in my Hub account that seem to be working correctly using this model.

nzwulfin commented 9 years ago

pinging @markllama as this may have relevance to https://github.com/projectatomic/atomicapp/issues/159

markllama commented 9 years ago

There's a lot here to like. Without fully understanding the implications yet, I have a couple of impressions.

I like the standardization of patterns, and the ONBUILD which makes creating custom atomicapps easier and clearer.

I recognize the need to embed configs by adding a layer but it really tastes bad to me. Lacking a better mechanism it will have to do.

I think I want to lay out a workflow for building and using atomicapps that might clarify how it relates to atomic host and the atomic CLI command, but I think we need to formulate it outside these comments.

aweiteka commented 9 years ago

@nzwulfin really interesting. Thanks for taking the time to spell this out.

aweiteka commented 9 years ago

I've been assigned to get this sorted out so I'm firing the conversation back up.

ONBUILD makes a lot of sense. Let's do it. It can be done independently, I think. It doesn't address the "how should install vs run work" question.

Here's what I think makes the most sense from an end-user perspective:

Thoughts?

cc @vpavlin @nzwulfin @goern

goern commented 9 years ago

pretty much in line with https://github.com/projectatomic/nulecule/issues/139

Re ONBUILD: +1. I transformed that into #195

aweiteka commented 9 years ago

With the ability to append args in the atomic CLI (see PR #110), we can consider a more streamlined approach that doesn't use install at all. The user only needs to run, stop, and update applications. Here are the use cases:

jasonbrooks commented 9 years ago

I think this is much clearer.

dustymabe commented 9 years ago

I like this better than what it was. I think there can be some more clarification though.

This:

Should probably be this:

Additionally, I don't think run with --dry-run is really a good way to say "I want to download template files", especially because (I think) template files won't normally be downloaded when a user does an atomic run. In that case --dry-run doesn't really mean dry run; it really means get files.

This:

Should probably be this:

stefwalter commented 9 years ago

@sub-mod CC

Could you guys ensure that this new scheme has a version exclusion gate, so that the caller can provide an option like --version-required=2 to specify that it expects to interact with a meta container that has a specific version of the nulecule code?

Otherwise, each time you rewrite everything over again ... the code has no idea how to deal with it. Usually we have package (i.e. RPM) version dependencies to help deal with this sort of mess, but that is not present with privileged containers.

ncoghlan commented 9 years ago

I'd never properly understood the install vs run distinction, so +1 for removing the need for it entirely.

Also +1 for @dustymabe's suggestion of a separate command for downloading the files without running anything - I'd suggest "atomic download" or "atomic fetch" over "atomic getfiles", but the key is making it a separate verb rather than a --dry-run modifier on "atomic run".

aweiteka commented 9 years ago

I agree with the separate verb. One issue: from an atomic CLI perspective, these are the LABELs we have available to us:

help - Command to run the help command of the image
run - Command to run the image
uninstall - Command to uninstall the image
install - Command to install the image
stop - Command to execute before stopping container
debug - Command to run the image with debugging turned on

Given this list, I don't see how we can practically use a download, fetch, or getfiles verb. Would another option suffice? --dry-run --getfiles or --dry-run --gen-answerfile?

sub-mod commented 9 years ago

@aweiteka I suggest keeping the verbs install, run, etc. in atomicapp and exploring the opt1, opt2, opt3 options available in the atomic command to support adding more sub-commands for install/run in atomicapp. For example: atomic install --opt3="--getfiles" IMAGE becomes atomicapp install --getfiles IMAGE.

Please take a look at https://github.com/projectatomic/atomicapp/pull/152. Append is not the only way to add/inject optional arguments into the atomicapp command; OPT1, OPT2 is the other way. The opt changes are available in the atomic-1.1.1 rpm.

dustymabe commented 9 years ago

@ncoghlan I like fetch as well. getfiles was only because I couldn't think of something better.

vpavlin commented 9 years ago

Reading through this, I am getting a bit confused. Do I understand correctly from @aweiteka's and @dustymabe's suggestions that we are getting rid of install? install is currently where answers.conf.sample is created, so there is no reason to do atomicapp run --dry-run, as install does not run anything anyway...

install was intended as a preparation step where all dependencies are pulled. Replacing it with run --dry-run doesn't make much sense to me, because run performs the templating of the artifacts, which is worthless at that point since we expect the user to modify answers.conf(.sample) first.

If we want to omit install and use run for everything, I'd suggest changing the labels so that the user has to provide --optX with the atomic command, i.e.

atomic run --opt2=fetch projectatomic/helloworld
docker run ... projectatomic/helloworld fetch /atomicapp

So we basically replace the command with $OPT2, and /atomicapp can become a symlink to (/host)/tmp/atomicapp/helloworld-$imageid. This would allow us to add any commands our workflow needs. We could potentially also default to run if no --opt2 is provided.
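
As a rough sketch, the RUN label under that scheme might look like the following, assuming atomic substitutes an ${OPT2} placeholder in the label before executing it (the exact placeholder syntax is an assumption here):

LABEL RUN docker run -it --rm --privileged --net=host -v /run:/run -v /:/host --name NAME -e NAME=NAME -e IMAGE=IMAGE IMAGE ${OPT2} /atomicapp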

Ideas?

nzwulfin commented 9 years ago

I'm of the opinion that we should leave install in place and not overload run with a lot of extra options. While install may not be the best verb, it will be more difficult to get a new LABEL into the container spec if we are the only users of that LABEL. There also seems to be consensus that --dry-run actually creating an object that the user is expected to interact with isn't an expected result of a "dry run".

If we can better explain what "installing" an atomicapp means in terms of run preparation, I think we can avoid the confusion @markllama brought up in https://github.com/projectatomic/atomicapp/issues/159 that started this particular issue. I see this as a documentation issue, not a functional one. We should also document what is downloaded and unpacked (Dockerfile? Dockerfile and Nulecule? Dockerfile, Nulecule, and app code?), what happens if you modify these, and how to run the modified app.

Using OPT* is interesting overall, but it seems too dependent on the arbitrary positions used by the Dockerfile writer. If someone overloads the LABEL, we could see unexpected behavior. I could also be overthinking that.

I'm for keeping install behavior as the preparatory verb, with a better explanation of the actions performed on the docker host to set user expectations.

ncoghlan commented 9 years ago

"install" inherently means mutating the host system to me. What command do I run to fetch all the artifacts without affecting the host system?

nzwulfin commented 9 years ago

I'm not sure I get the use case for fetching all the artifacts without the intention of "mutating" some host. Also, which host are we discussing when talking about containers: the atomic host, or the resulting container "host"?

Today there are no projects I'm aware of that provide non-image artifacts, and that means we've got an uphill battle to add a new label to the generic spec just for our sensibilities.

aweiteka commented 9 years ago

We could propose another LABEL specified for this purpose but I'm not convinced it's the right thing to do.

The "host" in question is the one initiating the action, the workstation or server that kicks off the command. Depending on the deployment target, the end-user may explicitly not want to alter their workstation. Rather, they only want to alter the remote cluster. So in many cases the intent of "install" means to "prep the remote machine but don't initiate the service."

The issue with the --opt* syntax is that it's awkward and not intuitive. It feels like a hack, and it basically is.

That said, I think it's best to keep INSTALL for now. Let's make it clear in the tool, the docs, and our demonstrations what it does. Let's focus our energies on the useful options to RUN that we all care about and that really impact UX.

dustymabe commented 9 years ago

So are we still going to unpack files to the current working directory? I often do atomic run helloapache followed by atomic run <something else>. If you do that and everything unpacks to the current directory then things get confusing quickly.

We could still detect if answers.conf exists in the current directory and use it, but other than that put everything in the temp directory.
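
As a very rough illustration of that idea at the docker level (the mount points, temp path, and read-only CWD mount are assumptions, not current atomicapp behavior): the host CWD is mounted read-only so an existing answers.conf can be picked up, while all unpacking goes to a per-app temp directory on the host.

docker run -it --rm --privileged --net=host -v /run:/run -v /:/host -v `pwd`:/host-cwd:ro -v /tmp/atomicapp-NAME:/atomicapp --name NAME -e NAME=NAME -e IMAGE=IMAGE IMAGE -v run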

aweiteka commented 9 years ago

@dustymabe right, we don't want to unpack to current workdir unless the user specifies that directory, overriding the tmpdir.

Sounds like we're on the same page.

vpavlin commented 9 years ago

So are we going to mount /tmp and create/find a specific directory there?

ncoghlan commented 9 years ago

In trying to get the Pulp local demo playing nice with host-owned directories, I'm honestly starting to think the better option is to automatically create a data container that exports the necessary volumes, and link the volumes from there. That leaves the host system clean, and gets you away from an SELinux nightmare.
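
A sketch of that data-container pattern with plain docker commands; the image name and volume path are placeholders for whatever the app actually needs:

# Create a named container whose only job is to own the volumes
docker create -v /var/lib/pulp --name pulp-data registry.example.com/pulp /bin/true
# Run the application container with those volumes linked in, leaving the host filesystem untouched
docker run -d --volumes-from pulp-data registry.example.com/pulp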

cctrieloff commented 9 years ago

For reference:

https://docs.google.com/document/d/17LVMS7vLaLbhfFeVenlcChsGWjBOmw1B44I63j7Sf1k/edit#heading=h.y9ygf9v7vdie

cdrage commented 9 years ago

@cctrieloff can you make the file public?

dustymabe commented 8 years ago

I think this will be taken care of by PR #278. It should be merged shortly, and we can confirm before beta-3.

dustymabe commented 8 years ago

This is done by 7d4dbab. Will close once beta3 branch is merged to master.