bazooka-ci / bazooka

Continuous Integration and Continuous Deployment Server
http://docs.bazooka-ci.io/
MIT License

Reuse the SCM checkout between builds #145

Closed jawher closed 9 years ago

jawher commented 9 years ago

Right now, we perform a brand new SCM checkout at every job start.

This is not ideal for large repos (tens to hundreds of megabytes).

We could benefit from finding a way to do an SCM checkout only once and perform updates on top of it for the following builds.

jawher commented 9 years ago

I started looking into this, and the way I see it, it will require some non-trivial reworking of bazooka internals.

The idea is to have a single source directory for the whole project instead of one per job. Orchestration will start the SCM fetch image with a flag telling it to update the existing checkout instead of performing a fresh one. If the update fails, a fresh checkout is performed as a fallback.

This poses the following problems:

Directory layout

A project level source directory will be outside the job directory (which is the docker context), and so cannot be added to the image.

Symlinking is not, and will not be, supported by docker for the ADD and COPY instructions (#docker/1676).

One possible solution is to bind mount the shared source dir into the job directory (using Go's syscall.Mount).

Concurrent jobs

A shared source directory will cause problems when multiple jobs on the same project are running concurrently.

The solution would be to start only one job at a time when the project is configured to reuse an SCM checkout, and to queue the other requests until that job is done. We'll need to add a persistent queuing component (something like Rabbit, or something lighter if possible) and a polling mechanism in bazooka to schedule queued jobs.
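To make the "one running job per project, queue the rest" rule concrete, here is a small in-memory Go sketch. It is deliberately not persistent, so it does not replace the queuing component mentioned above; `Scheduler`, `Submit`, and `Done` are hypothetical names, not existing bazooka APIs.

```go
package main

import (
	"fmt"
	"sync"
)

// Scheduler allows at most one running job per project and keeps the
// remaining requests in a FIFO queue.
type Scheduler struct {
	mu      sync.Mutex
	running map[string]bool     // project -> a job is currently running
	pending map[string][]string // project -> queued job IDs, oldest first
}

func NewScheduler() *Scheduler {
	return &Scheduler{
		running: make(map[string]bool),
		pending: make(map[string][]string),
	}
}

// Submit returns true if the job may start right away. Otherwise the job is
// queued behind the project's running job and false is returned.
func (s *Scheduler) Submit(project, job string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.running[project] {
		s.pending[project] = append(s.pending[project], job)
		return false
	}
	s.running[project] = true
	return true
}

// Done reports that the project's running job has finished. If another job
// is queued, it is returned and becomes the running job.
func (s *Scheduler) Done(project string) (next string, ok bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	queue := s.pending[project]
	if len(queue) == 0 {
		delete(s.running, project)
		delete(s.pending, project)
		return "", false
	}
	next = queue[0]
	s.pending[project] = queue[1:]
	return next, true
}

func main() {
	s := NewScheduler()
	fmt.Println(s.Submit("demo", "job-1")) // first job starts immediately
	fmt.Println(s.Submit("demo", "job-2")) // second job is queued
	fmt.Println(s.Done("demo"))            // job-1 finished, job-2 is next
	fmt.Println(s.Done("demo"))            // nothing left to run
}
```

Jobs of different projects never block each other here, since the queue is keyed by project. A persistent variant would store `pending` in a database or message broker so that queued jobs survive a restart.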

This might seem like overkill, but IMHO it is something we'll need anyway to be able to limit the number of concurrent builds (for jobs and variants), see #46. It will also be useful for #146, to avoid corrupting the mounted cacheable dirs (.m2, node_modules, etc.).