Open tduffield opened 4 years ago
Just wanted to follow up that I was able to get a POC of this working on my fork. It is super slick.
All CI systems set CI=true. Maybe you don't need the meta tag.
@cescoferraro I thought about that, but some folks might have a CI system that works with things as they are, and wouldn't want the new functionality on by default.
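One way to reconcile the two comments above is to require both signals: the `CI` environment variable and an explicit opt-in. The following is only a sketch. The `optIn` flag and the `ciModeEnabled` helper are hypothetical names, and it assumes the CI system exports `CI=true`, which is common but not universal.

```go
package main

import (
	"fmt"
	"os"
)

// ciModeEnabled reports whether the CI-friendly behavior should be used.
// optIn stands for a hypothetical config switch; getenv is injected so the
// function can be exercised without touching the real environment.
func ciModeEnabled(optIn bool, getenv func(string) string) bool {
	// Both conditions must hold, so existing CI setups keep working
	// unchanged unless they ask for the new behavior.
	return optIn && getenv("CI") == "true"
}

func main() {
	fmt.Println("CI mode:", ciModeEnabled(true, os.Getenv))
}
```

With this shape, a CI system that already works "as is" never sees the new behavior unless the opt-in is set.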
Thank you for the proposal! When I first created `dobi` I had ambitions to try and make it work as a config for CI, so I think the general idea makes a lot of sense.
There is one thing I don't understand from the proposal. If `dobi` sees newer timestamps for the result of a task then it will skip some tasks (e.g. building an image). But how will those images and artifacts be distributed so that they are available to the CI job running `dobi`?
If there is a mechanism for transferring files, could that mechanism be used to copy the `.dobi` files and avoid the need to read those timestamps from other places?
I am using `aws s3 sync`, where it is not possible to preserve timestamps. I chose to inject the `LastModified` content into the file itself to ensure that regardless of syncing implementation, you would be able to preserve the correct value.
This is a proposal for functionality that I need to write. The only question is whether or not you would like to accept it upstream.
The crux of my issue is that I want to use dobi in my CI environment to build Docker images. I need to optimize for parallelization, so I'm using the `depends` image config to build a DAG which I can use to run builds in parallel (one task per job). However, there are two issues that I'm running into with running dobi in CI, both of which revolve around the fact that I don't have a persistent host.

1) My git clone does not preserve the original modification times, so I always rebuild no matter what.
2) I can't rely on the `.dobi` directory to keep track of my records, so I end up rebuilding dependencies when I shouldn't.

My solution to this problem is to expand the existing functionality, which I refer to as `file-mtime`, to include more CI-friendly behavior. Specifically, instead of looking at the mtime of the context directory, I would look at the timestamp of my HEAD commit to determine if any of my files have been changed since I last rebuilt. And instead of relying on the mtime of the record file to determine when I last built my image, I would write the `LastBuild` timestamp into the record itself. I would use S3 to store the record files, syncing them to/from my hosts as necessary, but in theory this functionality could also be expanded to support other key-value stores such as DynamoDB, Consul, etc.

I figure that this behavior could be controlled by meta parameters like so:
When `hosted` is true, we would use git to determine when a file was last modified, and we would keep the `LastModified` value in the image record itself.

Thoughts? Questions? Concerns?