moby / moby

The Moby Project - a collaborative project for the container ecosystem to assemble container-based systems
https://mobyproject.org/
Apache License 2.0

Add support for specifying .dockerignore file with -i/--ignore #12886

Closed tristanz closed 5 years ago

tristanz commented 9 years ago

As several people have mentioned (@thaJeztah, @duglin) in #9707, it would be great to be able to specify the .dockerignore file using -i/--ignore in conjunction with named dockerfiles. It is often difficult to use named dockerfiles because the build context becomes too large.

runcom commented 9 years ago

dibs or sorry @tristanz are you already working on it?

duglin commented 9 years ago

To be clear, I didn't say I wanted it - just that it could be supported. You may want to see if there's buy-in from the team first before you implement it.

duglin commented 9 years ago

@runcom ^^^^

runcom commented 9 years ago

roger that!

tristanz commented 9 years ago

I'm not working on it.

It seems like a natural option given -f, since -f makes it much harder to use .dockerignore effectively. Without -i it is impossible to use -f for our repo because our build context is 1GB+. We have a pretty classic setup: a single repo with many services that share code. We'd like to be able to build all these services without custom scripting or slow builds.

thaJeztah commented 9 years ago

Not a maintainer (for this), I'd be +1 for this so that different Dockerfiles can exclude different parts of the build-context.

I'm not sure having a single-char flag (-i) is needed, perhaps --ignore-file (to match, e.g. --env-file).

@tristanz Perhaps you can describe some use-cases for this feature, it may help the maintainers make a well-founded decision whether or not this is something that's wanted

edit: Our comments just "crossed", I see you just added some description on your use-case

tristanz commented 9 years ago

Yes, I believe our use case is exactly the use case described in the named dockerfile thread: a single repo with many services that share libraries. So this is just trying to make that same use case work for real setups. Single repos like this tend to be large (which we love for many reasons!), so without --ignore-file we can't use -f and are forced to resort to prebuild scripts.

duglin commented 9 years ago

@tristanz can you give a little more detail on these large repos? In particular, when I think about large repos with multiple sub-projects, I tend to think of each project as having its own sub-dir with its own Dockerfile, while they all share a common dir that might be a sibling to each project. For example:

/common
/project1
/project2

and in these cases I wonder if supporting symbolic links wouldn't be easier. Meaning, /project1/common is a symlink to /common.

Also, to be clear, I'm not for or against a --ignore flag - still trying to understand the usecases.

tristanz commented 9 years ago

@duglin that's my setup exactly. I actually would much prefer symlink support to solve this. It feels less docker specific, allows me to leave my dockerfiles unchanged (sub folder build context), and I can avoid writing lots of dockerignore files. I was just under the impression that named dockerfiles were the recommended alternative. I see symlinks as the better alternative by far.

thaJeztah commented 9 years ago

Are symlinks supported on Windows nowadays? (Wondering)

For my personal use-case, some containers are created using a volume to mount the source-code during development. Being able to exclude those files (huge number of files, not the size per-se) when starting the dev container will save lots of time. (And, yup, there are workarounds)

tristanz commented 9 years ago

I guess the main question is whether there really is no solution for multiple container git repos with shared code. This seems like an extremely common use case, but any discussion of symlinks or relative paths is not finding traction as far as I can tell.

Here's the folder structure I'm thinking about:

/docker-compose.yml
/container1
   - Dockerfile
/container2
   - Dockerfile
/shared

Is there really no docker way? The best I have found so far is to simply copy shared into multiple places.

tristanz commented 9 years ago

Is it worth working on a pull request for this feature?

thaJeztah commented 9 years ago

> Is it worth working on a pull request for this feature?

It hasn't been decided on, yet, but you could try. With the chance it gets turned down :)

Sometimes discussing a PR helps making a decision, so..up to you, I guess :)

Ralle commented 9 years ago

Another solution would be to look for the .dockerignore file where the Dockerfile is stored, since the build directory is not necessarily the directory the Dockerfile is in, but I would also prefer -i.

thaJeztah commented 9 years ago

> Another solution would be to look for the .dockerignore file where the Dockerfile is stored

Unfortunately, that won't work if multiple Dockerfiles are in the same directory (e.g. Dockerfile.dev and Dockerfile.prod)

Ralle commented 9 years ago

Right. Then how about a Dockerfile command to describe the corresponding .dockerignore?

thaJeztah commented 9 years ago

I think the proposed solution (a --ignore-file option) is the best approach, that's most flexible. Only thing needed now is to decide if this is something that is desirable enough to implement.

tristanz commented 9 years ago

There are two things I dislike about named dockerfiles even with dockerignore.

  1. Dockerignore becomes needed everywhere.
  2. Lots of containers don't have shared dependencies, but it's not clear to users of my repository what the build context should be. It would be fine if the root context was used everywhere, but our repository has forks of public Dockerfiles that assume local context only.

Dereferencing symlinks seems much more elegant. I can always see all dependencies in each folder, like single Dockerfile repositories. Security could be handled by not allowing symlinks to escape the local folder unless an environment variable or flag is set.

tristanz commented 9 years ago

I take back my second point. A better convention would be to put Dockerfiles at the depth of the build context. So Dockerfile.web rather than web/Dockerfile. This would make the context clear enough.

In terms of this whole issue, is there a reason why the build context is the entire folder not simply files and folders that are explicitly ADDed (plus Dockerfile/.dockerignore)? This would make multiple .dockerignore mostly unnecessary.

Dereferencing symlinks also looks like a non-starter because there is already a defined behavior -- create the symlink rather than dereference it.
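
Dereferencing does not have to happen in the builder. As a client-side workaround, the context can be archived with symlinks followed and then piped to docker build. The sketch below assumes a tar that supports -h (GNU tar and bsdtar both do); deref_context is a hypothetical helper name, not a docker feature.

```shell
#!/bin/sh
# Hypothetical helper: write a tar of context directory $1 to stdout,
# replacing symlinks with the files or directories they point to (-h).
deref_context() {
    tar -chf - -C "$1" .
}

# Example with the layout from this thread (run from the repository root):
#   ln -s ../common project1/common
#   deref_context project1 | docker build -
```

Because the links are resolved before the builder sees the archive, the existing in-builder behavior for symlinks is left untouched.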

Ralle commented 9 years ago

I make a web project and have billions of files. Having ADD for each would create an immense amount of intermediate containers and I would need a script that built my Dockerfile for me.

tristanz commented 9 years ago

@ralle I don't think anything would change. You just add a folder like normal. I'm just suggesting avoiding tarring and uploading things you never add.

pikeas commented 9 years ago

+1 to, in order of preference:

1) Build context shouldn't send all files, as described by @tristanz above. To maintain backwards compatibility, the CLI flag could be --full-context defaulting to true.
2) Symlinks followed during build.
3) Custom .dockerignore with -i per this issue.

tristanz commented 9 years ago

I think 1) in @pikeas list is by far the most elegant, but it is currently not easy to implement since parent images may have ONBUILD instructions.
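
For simple Dockerfiles, the paths named by COPY and ADD can be listed with a few lines of shell. This is only a rough sketch of the idea: copy_sources is a hypothetical helper, it assumes shell-form instructions that fit on one line, and it cannot see ONBUILD instructions inherited from a parent image, which is exactly the limitation mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: print the source paths named by COPY and ADD
# instructions in the Dockerfile given as $1, one per line, deduplicated.
# Assumptions: shell-form instructions only (no JSON arrays), no line
# continuations, no URLs as ADD sources.
copy_sources() {
    awk 'toupper($1) == "COPY" || toupper($1) == "ADD" {
        from_stage = 0
        # copies from another build stage do not read from the context
        for (i = 2; i < NF; i++) if ($i ~ /^--from=/) from_stage = 1
        if (from_stage) next
        # the last field is the destination; fields starting with -- are flags
        for (i = 2; i < NF; i++) if ($i !~ /^--/) print $i
    }' "$1" | sort -u
}

# Usage: copy_sources Dockerfile
```

The resulting list could be fed to an archiver to build a reduced context, though wildcard sources would need to be expanded first.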

cusspvz commented 9 years ago

@tristanz main difference between web/Dockerfile and Dockerfile.web is that on the first one you don't have the ability to add files behind that path.

Dockerfile is awesome for straight builds, but it lacks support for important deploy options such as environment-related diffs...

@fernandoneto please check if this works for you:

DOCKERPATH=/path/to/docker/build/folder
DOCKERFILE=Dockerfile.production
DOCKERIGNORE=.dockerignore.production
# Archive relative to $DOCKERPATH so the Dockerfile sits at the root of the context
tar -cvzf - --exclude-from="$DOCKERPATH/$DOCKERIGNORE" -C "$DOCKERPATH" . | docker build -f "$DOCKERFILE" -

GordonTheTurtle commented 8 years ago

USER POLL

The best way to get notified when there are changes in this discussion is by clicking the Subscribe button in the top right.

The people listed below have appreciated your meaningful discussion with a random +1:

@alenca @cusspvz @fernandoneto

jakirkham commented 8 years ago

Though I am not against having the .dockerignore file specified by an option, I find myself needing the following things:

mishak87 commented 8 years ago

I would like to claim this one along with improvements to ignore internals.

thaJeztah commented 8 years ago

thanks @mishak87

I'll ping @tiborvass here to check if this will conflict with upcoming changes in the builder, but feel free to ask him on IRC #docker-dev (he's "tibor" on IRC)

thaJeztah commented 8 years ago

@mishak87 I just chatted with @tiborvass and it should not be a problem w.r.t. upcoming builder changes. Be aware that there's no "final" decision made yet for this feature, so we can discuss and decide during "design review" of your PR.

One thing: I'd opt for not using a short flag (-i) and only the long one (--ignore or --ignore-file). We try to reserve short flags for frequently used options (and we can always add a short version later).

Thanks in advance for your contribution!!

mishak87 commented 8 years ago

@thaJeztah Thanks for the info, I will work on it the next week.

thaJeztah commented 8 years ago

Awesome! Thanks in advance @mishak87!

mrluc commented 8 years ago

This makes a lot of sense. Google led me here because I assumed that if you can specify a named Dockerfile you can specify a .dockerignore file.

For instance, if you do your building with a fat image from a Dockerfile.build, and then deploy against a skinny image from a Dockerfile, it makes sense that you'd like to have a .dockerignore.build as well that might ignore everything except the binary artifacts from the build step.

Only a minor annoyance given the size of my repos atm, but thank you @mishak87

LATER NOTE: because this comment has been here a while, I feel like I should update this to mention that even though this issue is legit, it's an aesthetic, 'dev UX' kind of issue.

If someone's stumbling across my comment in a StackOverflow kind of mood, because this is actually getting in the way of having separate build and run dockerfile/dockerignore pairs, don't forget that you can just put them in different directories. Ie, in your build script, after building against your 'build' Dockerfile/ignore and extracting the build artifacts, move the artifacts and the 'run' dockerfile/ignore into a separate toplevel (probably gitignored) binary_artifacts directory.
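
The directory shuffle described above might look like the following in a build script. stage_run_context and all of the paths are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: assemble a separate context directory for the 'run' image.
#   $1: directory holding the build artifacts
#   $2: the 'run' Dockerfile
#   $3: the 'run' ignore file
#   $4: staging directory (created if missing)
stage_run_context() {
    mkdir -p "$4"
    cp -R "$1"/. "$4"/            # copy artifacts, including dotfiles
    cp "$2" "$4/Dockerfile"       # default names, so no -f is needed
    cp "$3" "$4/.dockerignore"
}

# Usage, after the 'build' image has produced ./out:
#   stage_run_context out Dockerfile.run dockerignore.run binary_artifacts
#   docker build binary_artifacts
```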

pshomov commented 8 years ago

+1

jorge07 commented 8 years ago

+1

5c077yP commented 8 years ago

+1

shanemcd commented 8 years ago

Any updates on this?

mishak87 commented 8 years ago

@thaJeztah I feel like my PR got stuck. Should I rewrite it to make it work with latest changes? I could make time for it by end of September.

jonywtf commented 7 years ago

+1

tkeeler33 commented 7 years ago

+1

gittycat commented 7 years ago

Let's assume 1) above where the whole context is not sent by default, couldn't the context then be derived from what's in ADD and COPY statements? No need for .dockerignore.

Alternatively (keeping things compatible, ie: sending the whole context by default), since a dockerignore seems to be specific to a Dockerfile, we could specify the dockerignore to use in the Dockerfile itself.

BTW, I am hitting the problem of the one .dockerignore with two Dockerfiles (one for building my app, the other for creating the production image). I will probably have to create a subdir just to build from the second Dockerfile. That's the simplest solution I can think of without a way to specify a different dockerignore for each Dockerfile.

tarikjn commented 7 years ago

@gittycat I like your approach, maybe both can be combined? ADD/COPY specify files to add to the context, .dockerignore what to ignore in them, seems like they have slightly different roles. This has the added benefit of improving performance and security of existing docker builds.

ghost commented 7 years ago

Any traction on this?

askoretskiy commented 7 years ago

Having a single .dockerignore is annoying when you build multiple images via docker but really a problem once you use docker-compose. Then you cannot juggle with symlinks any more.

Moreover, docker-compose has exactly this bug and cannot fix it until docker supports this feature -- https://github.com/docker/compose/issues/2098

I would agree with other colleagues that .dockerignore would be really optional if docker generates context out of COPY and ADD commands from Dockerfile, ignoring all the files that are not there.

wclr commented 7 years ago

also --ignore option should take multiple ignorefiles

bryanlarsen commented 7 years ago

The lack of this feature prevents us from running docker builds in parallel while doing CI.

Our test script starts out something like this:

cp foo/dockerignore .dockerignore
docker build -f foo/Dockerfile .
cp bar/dockerignore .dockerignore
docker build -f bar/Dockerfile .
...

The builds could happen in parallel for a massive speedup. Slow CI is much less useful. :(

A common .dockerignore would result in a ~10GB context.
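
One way to avoid the shared .dockerignore, and so let the builds run side by side, is to stream each context through tar with its own exclude list, along the lines of the earlier tar workaround. A sketch, assuming the per-image files contain tar exclude patterns rather than full .dockerignore syntax; context_tar is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: archive context directory $1 to stdout, skipping paths
# that match the tar exclude patterns listed in file $2.
# Assumption: tar pattern syntax is similar to, but not the same as,
# .dockerignore syntax (there are no "!" exceptions, for instance).
context_tar() {
    tar -cf - --exclude-from="$2" -C "$1" .
}

# Each build reads its own stream, so nothing is copied over a shared
# .dockerignore and the builds can run concurrently:
#   context_tar . foo/dockerignore | docker build -f foo/Dockerfile - &
#   context_tar . bar/dockerignore | docker build -f bar/Dockerfile - &
#   wait
```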

thaJeztah commented 7 years ago

This feature was put on hold, pending other changes that are being researched/worked on; one of those is making the builder "smarter" when sending the build context; see https://github.com/docker/docker/issues/31829

enkoder commented 7 years ago

Any update on when this will be worked on? This would solve a LOT of my problems when working in a monolithic repo.

tzickel commented 7 years ago

I have encountered this issue just today as well. I've spent time reading all of this discussion and most of the related PR.

My opinion is that there are lots of radical ways to improve building in general, but today we have a very nice and simple way to declare some files to be ignored. There is just one glaring issue: while we can name the Dockerfile however we want (and thus build different dockers from the same directory), we cannot name the .dockerignore however we want, so most of us end up sending tons of stuff we don't need to the context, unnecessarily loading the network and increasing build times.

I do not understand why a simple PR that just adds one option, --ignore-file (which should have been done along with -f), even if it takes only one file, would conflict with future work on adding other ways to change the context (I assume that, for backwards compatibility, .dockerignore will always be an option, and thus people will always want the option to rename it).

mxl commented 7 years ago

@thaJeztah Could you reopen #18754? I see that you wrote that this feature was put on hold because of #31829 and other changes. But if you do not intend to remove ignoring files from the build context with .dockerignore, then I do not see how this feature conflicts with other changes. It's so frustrating that we are limited to the .dockerignore file name and need to mess with scripts. Using docker-compose just wastes time and resources without an option to use different .dockerignore files for several services with a large shared build context. #18754 is only 32 lines of code and it could already have helped many people during these two years. I agree with @tzickel: it's a very small and very effective change that requires no hard work or large proposals.

cusspvz commented 7 years ago

@thaJeztah could you please point out which file handles the creation of the tar by the cli?