dnephin / dobi

A build automation tool for Docker applications
https://dnephin.github.io/dobi/
Apache License 2.0

Proposal: CI friendly tracking #182

Open tduffield opened 4 years ago

tduffield commented 4 years ago

This is a proposal for functionality that I need to write. The only question is whether you would like to accept it upstream.

The crux of my issue is that I want to use dobi in my CI environment to build Docker images. I need to optimize for parallelization, so I'm using the depends Image config to build a DAG which I can use to run builds in parallel (one task per job). However, there are two issues that I'm running into with running dobi in CI, both of which revolve around the fact that I don't have a persistent host.

1) My git clone does not preserve the original modification times, so I always rebuild no matter what.
2) I can't rely on the .dobi directory to keep track of my records, so I end up rebuilding dependencies when I shouldn't.

My solution to this problem is to expand the existing functionality, which I refer to as file-mtime, to include more CI-friendly behavior. Specifically, instead of looking at the mtime of the context directory, I would look at the timestamp of my HEAD commit to determine if any of my files have been changed since I last rebuilt. And instead of relying on the mtime of the record file to determine when I last built my image, I would write the LastBuild timestamp into the record itself. I would use S3 to store the record files, syncing them to/from my hosts as necessary, but in theory this functionality could also be expanded to support other key-value stores such as DynamoDB, Consul, etc.

I figure that this behavior could be controlled by meta parameters like so:

meta:
    project: hosted-runtime
    hosted: true # can be controlled via ENV['DOBI_HOSTED']

When hosted is true, we would use git to determine when a file was last modified, and we would keep the LastModified value in the image record itself.

Thoughts? Questions? Concerns?

tduffield commented 4 years ago

Just wanted to follow up that I was able to get a POC of this working on my fork. It is super slick.

cescoferraro commented 4 years ago

All CI systems set CI=true. Maybe you don't need the meta tag.

tduffield commented 4 years ago

@cescoferraro I thought about that, but some folks might have a CI system that works with things as they are, and wouldn't want the new functionality on by default.
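
A rough sketch of how the opt-in could be resolved, assuming the `hosted` config value is the default and `DOBI_HOSTED` overrides it when it holds a parseable boolean. The function name `hostedEnabled` is hypothetical. `CI=true` is deliberately ignored so that existing CI setups keep their current behavior.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// hostedEnabled resolves the proposed `hosted` setting.
// configValue comes from the meta section; DOBI_HOSTED, if set to a
// value strconv.ParseBool accepts, takes precedence over it.
// The lookup parameter matches os.LookupEnv so tests can inject a fake.
func hostedEnabled(configValue bool, lookup func(string) (string, bool)) bool {
	if raw, ok := lookup("DOBI_HOSTED"); ok {
		if v, err := strconv.ParseBool(raw); err == nil {
			return v
		}
	}
	// Unset or unparseable: fall back to the config file.
	return configValue
}

func main() {
	fmt.Println("hosted:", hostedEnabled(false, os.LookupEnv))
}
```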

dnephin commented 4 years ago

Thank you for the proposal! When I first created dobi I had ambitions to try and make it work as a config for CI, so I think the general idea makes a lot of sense.

There is one thing I don't understand from the proposal. If dobi sees newer timestamps for the result of a task then it will skip some tasks (e.g. building an image). But how will those images and artifacts be distributed so that they are available to the CI job running dobi? If there is a mechanism for transferring files, could that mechanism be used to copy the .dobi files and avoid the need to read those timestamps from other places?

tduffield commented 4 years ago

I am using aws s3 sync, where it is not possible to preserve timestamps. I chose to inject the LastModified content into the file itself to ensure that regardless of syncing implementation, you would be able to preserve the correct value.