Open kasimon opened 5 years ago
Check out https://github.com/xorpaul/g10k/releases/tag/v0.7.1
Now I'm saving the hash sum of the Puppetfile inside a file .g10k-deploy.json
{
"name": "benchmark",
"signature": "c140b0800399550395eb9640bb9cf227956abd01",
"started_at": "2019-08-27T17:39:25.837150578+02:00",
"finished_at": "2019-08-27T17:39:32.908326876+02:00",
"deploy_success": true,
"puppetfile_checksum": "b0de4967264974eeb2b3381e90c2a4aefb91f5f2b4cbb70eb863d5fbbcfd6e1c"
}
and then check if the Puppetfile in the environment has changed. If not, then g10k outputs this:
Skipping Puppetfile sync of branch example_benchmark because /tmp/example/example_benchmark/Puppetfile did not change
Please try it out.
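The check described above can be sketched roughly as follows. This is not g10k's actual code (g10k is written in Go); it is a minimal Python illustration. The use of SHA-256 is an assumption based only on the 64-character hex digest in the example, and the function names `puppetfile_checksum` and `puppetfile_unchanged` are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def puppetfile_checksum(puppetfile: Path) -> str:
    """Return the hex digest of the Puppetfile contents (SHA-256 is assumed)."""
    return hashlib.sha256(puppetfile.read_bytes()).hexdigest()


def puppetfile_unchanged(env_dir: Path) -> bool:
    """True if the checksum stored in .g10k-deploy.json matches the Puppetfile."""
    deploy_file = env_dir / ".g10k-deploy.json"
    puppetfile = env_dir / "Puppetfile"
    if not deploy_file.is_file() or not puppetfile.is_file():
        return False  # nothing to compare against, so a full sync is needed
    try:
        previous = json.loads(deploy_file.read_text())
    except json.JSONDecodeError:
        return False  # unreadable deploy file: do not skip the sync
    if not isinstance(previous, dict):
        return False
    return previous.get("puppetfile_checksum") == puppetfile_checksum(puppetfile)
```

A caller would skip the module sync for an environment only when `puppetfile_unchanged` returns `True`, and fall back to a full sync in every other case.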
I think I have to remove this feature, because after thinking about it a bit I noticed the following problem.
What if you have a module inside your Puppetfile, which is simply tracking a branch and a commit to that branch was done?
The Puppetfile itself hasn't changed from the previous g10k run, so g10k wouldn't even bother to check if the latest commit to the tracked branch is still the same and never pick up the committed changes.
The only thing I can think of to speed up your 500 git modules inside your Puppetfile is to check when a commit hash has been specified for a module if this exact commit hash is already deployed before even trying to check for updates to that module's git repository.
But this would mean that you would have to track and specify the commit hash in the Puppetfile.
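A minimal sketch of that idea, in Python for illustration only: the fetch for a module is skipped when the Puppetfile pins a full commit hash and that same hash is recorded as deployed. How the deployed hash would be recorded is left open here; `can_skip_fetch` and both of its parameters are hypothetical names, not part of g10k.

```python
import re
from typing import Optional

# A full SHA-1 commit id is 40 hexadecimal characters.
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def can_skip_fetch(pinned: Optional[str], deployed: Optional[str]) -> bool:
    """True if the module pins a full commit hash that is already deployed.

    pinned:   the commit given in the Puppetfile, or None for a branch/tag.
    deployed: the commit recorded for the deployed module, or None if unknown.
    """
    if pinned is None or deployed is None:
        return False  # a moving target or unknown state: always check the repo
    pinned, deployed = pinned.lower(), deployed.lower()
    return FULL_SHA.fullmatch(pinned) is not None and pinned == deployed
```

Abbreviated hashes are deliberately not treated as pinned in this sketch, since they cannot be compared without asking the repository.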
> While I can see a point for that in situations where someone is tracking the latest version of a module, in cases where exact versions or commits are pinned in the Puppetfile making this check optional would result in a huge performance increase.
I already mentioned that problem :)
> The only thing I can think of to speed up your 500 git modules inside your Puppetfile is to check when a commit hash has been specified for a module if this exact commit hash is already deployed before even trying to check for updates to that module's git repository.
Yes, that would be possible. Actually, in the meantime we changed our setup even further and now build each environment only once into a tarfile named after the hash of its newest commit in our puppet repo, and then ship these tarballs to our compile master. This has the additional advantage that if two environments are on the exact same code version, their code is built only once. It also allows us to move the g10k (now only for the Puppetfile) and puppet parser generate calls off the compile masters, reducing their load even more. This setup shares the same problem, namely that we must make sure that the Puppetfile contains no 'moving' targets, but that policy is okay for us.
So we currently don't need that feature anymore, but I still think some intelligent performance optimization would be a good idea.
FWIW, we recently added an --incremental flag to r10k, which will load the Puppetfile in an existing environment if it exists, then sync the environment, and when loading the updated Puppetfile see if a module's version a) "floats", or b) is "static" but has changed between the two Puppetfile versions. We then only sync the modules that pass either of those tests.
If you implement similar functionality, it might be good to share the terminology of that kind of deploy being "incremental"?
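The selection rule described above could look roughly like this. It is a Python sketch of the idea, not r10k's Ruby implementation; the `Module` type and `modules_to_sync` are hypothetical names, and treating a newly added static module as "changed" is an assumption.

```python
from typing import Dict, List, NamedTuple


class Module(NamedTuple):
    version: str  # the version, tag, commit or branch given in the Puppetfile
    static: bool  # True if the version is pinned, False if it "floats"


def modules_to_sync(old: Dict[str, Module], new: Dict[str, Module]) -> List[str]:
    """Select the modules an incremental deploy would still need to sync."""
    selected = []
    for name, module in new.items():
        if not module.static:
            selected.append(name)  # a) floating: may have moved upstream
        elif old.get(name) != module:
            selected.append(name)  # b) static, but changed (or newly added)
    return sorted(selected)
```

Static modules that are identical in both Puppetfile versions are the only ones left out, which is where the time saving comes from.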
@justinstoller Thanks for thinking of me/g10k :smiley:
What do you mean by a module version floats or is static?
Is static simply a pinned version, like 1.0.1, and floats something like latest? (What about a tag or branch?)
Can you post a link to the r10k code changes? Maybe that's easier :smirk:
You got the idea! A static version is any explicit version given to a forge module (not :latest or left off), and for git modules it's declarations that specify :commit or :tag, or :ref (but only if the value given to :ref matches a 40 character sha).
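Read literally, that definition translates into something like the following. This is a Python illustration only, not the r10k code; `forge_is_static` and `git_is_static` are hypothetical names, and representing a git module's options as a plain mapping with string keys is an assumption.

```python
import re
from typing import Mapping, Optional

# "A 40 character sha" is taken to mean 40 hexadecimal characters.
FULL_SHA = re.compile(r"[0-9a-fA-F]{40}")


def forge_is_static(version: Optional[str]) -> bool:
    """A forge module is static if an explicit version other than latest is given."""
    return version is not None and version not in ("latest", ":latest")


def git_is_static(opts: Mapping[str, str]) -> bool:
    """A git module is static for :commit, :tag, or a :ref that is a full sha."""
    if "commit" in opts or "tag" in opts:
        return True
    ref = opts.get("ref")
    return ref is not None and FULL_SHA.fullmatch(ref) is not None
```

Under this reading, a module that tracks a branch, or gives a branch name through :ref, is floating and would be checked on every run.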
The code is implemented on each module type class like R10K::Module::Git#L20-L28 but I don't know how helpful that is. Partly because Reid's been working on a new yaml specification format that gives every module an explicit type and version (so we have to check for type in that code and then treat version the same as we otherwise treat a ref).
The test code may be more helpful since it gives examples: see spec/unit/module/git_spec.rb#L14-L39 for git, and spec/unit/module_loader/puppetfile_spec.rb#L357-L386 for testing the resolving behavior against this Puppetfile: spec/fixtures/unit/puppetfile/various-modules/Puppetfile.
Hope that helps!
Hi Adrian,
bear with me, yet another wishlist issue: We always have dozens of environments with lots of modules and one thing that takes up a lot of time is g10k scanning every module's git repo for updates even in case the Puppetfile hasn't changed since the last check out. While I can see a point for that in situations where someone is tracking the latest version of a module, in cases where exact versions or commits are pinned in the Puppetfile making this check optional would result in a huge performance increase. Currently g10k runs close to a minute checking more than 500 repos on every run, where running git ls-remote against the puppet repo(s) should only take seconds.
BR Karsten
PS: I just pinged you on xing, in case you want to connect there :)
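For illustration, the cheap check mentioned in the request (one git ls-remote call against the puppet repo instead of scanning every module repo) might be sketched like this in Python. The helper names are hypothetical, and how the previously deployed branch heads would be stored is an assumption; only the "sha, whitespace, ref name" output format of git ls-remote is relied upon.

```python
from typing import Dict, List


def parse_ls_remote(output: str) -> Dict[str, str]:
    """Map branch name to commit id from `git ls-remote --heads <url>` output."""
    heads = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # ignore blank or unexpected lines
        sha, ref = parts
        if ref.startswith("refs/heads/"):
            heads[ref[len("refs/heads/"):]] = sha
    return heads


def changed_branches(previous: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """Branches that are new or whose head differs from the last deployment."""
    return sorted(b for b, sha in current.items() if previous.get(b) != sha)
```

Only the environments returned by `changed_branches` would need their Puppetfile processed at all, provided the Puppetfile contains no moving targets.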