Open hedgehog opened 12 years ago
@hedgehog,
No Special Privilege.
One thing I do not want to do is give special consideration to any specific source. For example, there is no special consideration for the Opscode community site. Indeed, any Cheffile that wants to pull cookbooks from the Opscode community site is required to plug in the URL endpoint of the Opscode community site at the top. The API is just 3 routes, and can be implemented with a simple Sinatra application; indeed, it could probably be implemented with some static files behind a web server - and Librarian-Chef will support it.
Likewise, I do not want to give special consideration to any particular GitHub organization. Indeed, I do not want to give special consideration to GitHub as a website. If there were to be a good way to use the `cookbooks` organization from GitHub from Librarian-Chef, then it would have to be made generic across multiple git hosting providers. The pertinent questions are: is this possible, is this easy, and what should it look like? How can this be done without treating the `cookbooks` organization as special, and indeed without treating GitHub as special? Additionally, the consideration you raised about treating the `qa` branch as special for the repositories in the `cookbooks` organization is not a generic consideration; how can this be done so that things are generic and no one has special considerations but so that we can also use the `cookbooks` repository very easily?
One specific consideration is that right now Librarian treats the `master` branch as the default branch, but does not ask the repository what the correct default branch is. So one specific item to change is that Librarian will need to ask the repository what its default branch is, and use that by default rather than `master`.
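One way a client could ask a remote for its default branch, sketched under assumptions: a git version that supports `git ls-remote --symref <url> HEAD`, whose output starts with a `ref: refs/heads/<branch>` line when the remote advertises it. The helper name `default_branch_from_symref` is hypothetical and is not part of Librarian; it only parses text that has already been captured.

```ruby
# Hypothetical helper: extract the default branch from the text printed by
# `git ls-remote --symref <url> HEAD`, which looks roughly like:
#   ref: refs/heads/qa<TAB>HEAD
#   05454b507d...<TAB>HEAD
# Falls back to "master" when no symref line is present.
def default_branch_from_symref(output, fallback = "master")
  output.each_line do |line|
    match = line.match(%r{\Aref: refs/heads/(\S+)\s+HEAD\s*\z})
    return match[1] if match
  end
  fallback
end

sample = "ref: refs/heads/qa\tHEAD\n05454b507d\tHEAD\n"
puts default_branch_from_symref(sample)
```

Keeping the parsing separate from the shell call makes the fallback to `master` easy to test without network access.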
@hedgehog,
Please don't take the long reply as a "no". It's a "no" for special privileges for any particular source, but it's a "yes" for figuring out a good but general way for making this work. To summarize, the question is: What is the general way to do it that at the same time makes using the `qa` branch on the `cookbooks` organization repositories as a common source really easy?
Sounds good. I wasn't suggesting special consideration, rather what you suggest, allow for a wider set of sources. "support the cookbooks collection on Github" isn't a plea for exclusivity ;) just the opposite - a request for admission.
How generic the `site ...` method ends up will depend on implementation details. `site :xyz` is of course meaningless and would have to be an alias to some defaults for a more general interface; it should be trivial to expose that detail in place of the alias `:xyz`, e.g. as a hash. The implementation may not be trivial, and may need baking time :) I only mentioned the `qa` branch since most seem to assume the branch of interest is always `master`.
I have some more generic ideas for deployment than just git from GitHub, but I will post those in a separate issue.
Thanks for the consideration.
@hedgehog,
The `site` source is reserved for HTTP endpoints that look like the Opscode Community Site API. It could be any Sinatra application with the 3 expected routes or it could be static files served by Apache.
The `git` source is reserved for Git repositories.
Perhaps a `git_site` source?
git_site "https://github.com/cookbooks"
The semantics could be: the actual git URL is `#{remote}/#{name}.git` by default. So
git_site "https://github.com/cookbooks" # set the default source
cookbook "apache2" # use the default source
would translate to a git URL of `https://github.com/cookbooks/apache2.git`.
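A minimal sketch of that URL rule, assuming nothing beyond the `#{remote}/#{name}.git` convention proposed here. `git_site_url` is a hypothetical helper name, not an existing Librarian method, and the trailing-slash handling is an added assumption.

```ruby
# Hypothetical helper mirroring the proposed `git_site` semantics:
# the git URL is "#{remote}/#{name}.git".
# Assumption: a single trailing slash on the remote is tolerated.
def git_site_url(remote, name)
  "#{remote.chomp('/')}/#{name}.git"
end

puts git_site_url("https://github.com/cookbooks", "apache2")
```

Because the rule is plain string composition, it would work the same for any git hosting provider that exposes repositories under a common root.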
With this example, though, you would still have to specifically set the default branch to `qa` on each repository in the GitHub `cookbooks` organization.
> The `site` source is reserved for HTTP endpoints that look like the Opscode Community Site API.
Yup, that is what I thought/feared.
Hence the plea to be admitted to that exclusive HTTP club ;)
Ack. I like to dress appropriately, but differently ... i.e. HTTP, but not the Opscode API (too limiting)
Doesn't have to be git, in fact the more I mull my ideas the less I lean toward git in the first instance - maybe in a later iteration. I'm holding off on opening an issue describing the proposal simply because the more I contemplate it the more it changes, and simplifies.
What do you find too limiting about the Opscode API?
Maybe unnecessary would have been a better word:
- `git` protocol for secured access, or for when you need the history, i.e. use the `--preserve-git` option mentioned in issue #38
- `http` protocol for public cookbooks: plain HTTP GET, no 3rd-party API icing. This is the idea mentioned in the comment:
Store the tagged version of a cookbook (i.e. no git history) as a zipped file on a CDN like the AWS cloudfront.
This way in production settings, if you can reach the web you can get your cookbooks. Not 100% foolproof, but much better than relying on GitHub or whichever git server is the single point of failure.
Of course if you can't reach any of the CDN edges, very likely no one can reach your site.
Naturally, secure networks will either have their own git server, or an http server they can point to in their Cheffile.
This of course is for the future, much like the single repo idea was before its time ;) That said, I think it is worth bearing in mind.
Just some thoughts.
[...] `~/.ssh` or deploy keys given directly in the Cheffile (not implemented - could use some help writing tests for deploy keys).
[...] `cookbooks/` while adding `cookbooks/*/.git` to your `.gitignore`. You'd still have the `cookbooks/*/.git` directories locally (you could have a script to clean them out locally) but they wouldn't propagate to source control or to production.
[...] `cookbooks/` while gitignoring the embedded git directories, or (b) have a build step which runs on a build server, uses librarian-chef to fetch the cookbooks into `cookbooks`, packages up the results into a tarball, and deploys that built & packaged-up project tarball to nodes, rather than directly deploying the repo source to nodes.
Yes, but when they change it we have one more (avoidable) task :) I was thinking even simpler. Just a URI that Librarian GETs. To dry the code out it may be neat to have a common URI root that a cookbook's ref gets appended to. Different people's prod would be using different Git tags, so we'll need that. So adopt a convention to eliminate some configuration. Most succinct convention? `<vendor>/<cookbook>_<tag>.tgz`. Hence, writing the code you wish you had, the Cheffile entry might look like:
site 'http://cookbooks.io/' # www.cookbooks.io might be a human UI that different vendors can sponsor by picking up the hosting tab?
cookbook 'rbenv', :vendor => 'hw', :tag => 'v0.4a'
cookbook 'rsyslog', :vendor => 'oc', :tag => '1.1'
cookbook 'ruby_build', :git => 'https://github.com/fnichol/chef-ruby_build',
:ref => '05454b507d'
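As a rough illustration of the `<vendor>/<cookbook>_<tag>.tgz` convention, the URL Librarian would GET could be composed as below. This is a sketch only: `tarball_url` is a hypothetical name, nothing like it exists in Librarian today, and joining the path onto the `site` root is an assumption about how the proposal would be wired up.

```ruby
# Hypothetical helper: builds "<root>/<vendor>/<cookbook>_<tag>.tgz"
# from the `site` root and the options on a `cookbook` line.
def tarball_url(root, cookbook, vendor:, tag:)
  "#{root.chomp('/')}/#{vendor}/#{cookbook}_#{tag}.tgz"
end

puts tarball_url("http://cookbooks.io/", "rbenv", vendor: "hw", tag: "v0.4a")
puts tarball_url("http://cookbooks.io/", "rsyslog", vendor: "oc", tag: "1.1")
```

With this rule the only per-cookbook configuration left in the Cheffile is the vendor and the tag.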
[...] `.git` downloaded via HTTP, i.e. the `<vendor>/<cookbook>_<tag>.tgz` is a git archive (from memory). If you use `site 'git://..`, then the keep/delete `/.git` logic kicks in.
`http://` targets production use cases: the convention is (we assume) that the devops people have cited the correct vendor and tag combinations that work, so Librarian can do a parallel download (fast) of the `*.tgz` files, unpack them, then resolve child dependencies. If the tarball is a flat archive (no `.git` history) then grabbing all the HTTP cookbooks in a Cheffile should not take much longer than the time to download the largest. I think the cookbook metadata should be kept out of the scope of cookbook retrieval. Once the cookbooks have been retrieved then the metadata moves into Librarian's scope.
[...] `cookbook 'rsyslog', :vendor => 'oc', :tag => '1.1'` and you'll get such cookbooks (fast) from the location `site 'http://<uri>'`. You can still point to a git server in the same Cheffile (as above).
Agree a cookbook has only one source. No real/cache sources. Apologies for the confusion. Bundler+Chef has been history for some time. Now I've experienced Chef's needs, esp in production: Bundler can't work. Period.
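The parallel download of `*.tgz` files mentioned above could look roughly like the following, using one plain Ruby thread per URL so that total time approaches that of the slowest single download. `fetch_all` is a hypothetical name, and the block stands in for the real HTTP GET so that the sketch runs offline; it is not how Librarian currently fetches anything.

```ruby
# Hypothetical sketch: fetch every tarball URL concurrently.
# The caller supplies the actual fetch logic as a block (an assumption made
# here so the example needs no network); results are keyed by URL.
def fetch_all(urls, &fetch)
  threads = urls.map do |url|
    Thread.new { [url, fetch.call(url)] }
  end
  # Thread#value waits for each thread and returns its block's result.
  threads.map(&:value).to_h
end

urls = [
  "http://cookbooks.io/hw/rbenv_v0.4a.tgz",
  "http://cookbooks.io/oc/rsyslog_1.1.tgz"
]
results = fetch_all(urls) { |url| "bytes of #{File.basename(url)}" }
results.each { |url, body| puts "#{url} -> #{body}" }
```

Unpacking and dependency resolution would then happen after all threads have finished, which matches the order described in the comment.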
Just to clarify the motivation for supporting tag archives over HTTP, in addition to git repositories: chef-client, tag=1.1.0:
In production settings that can make quite a difference, and the differential will increase over time. Even if you do not change the revision you use in production, your git download will continue to grow.
@yfeldblum, I might take a stab at this. Re 2.) above: can you say how tricky it would be to write a source class that doesn't use the Opscode API? That is, what is the minimal JSON you mentioned, and is there an alternative route to providing/setting this data in Librarian?
You can look at the site source. You would want to implement the same public class and instance methods, but with a new implementation for git.
You can also look at the librarian git source vs the librarian-chef git source as an example of separating the abstract part of it from the part of it tied specifically to chef and cookbooks.
To wire it in, you would also need to add a line to `lib/librarian/chef/dsl.rb`.
Thanks for the prompt response. I'm not going to tackle git right now. Rather, I'll try to implement a source using a simpler, plain HTTP convention.
I'd looked over the existing site source and the specs. From the specs it seems librarian only requires `version` be in the json file, `name` being available in the Cheffile. Version info is provided in metadata files. So it seems the pre-existing `metadata.rb` could substitute for the json file? However, is it the case that no other part of librarian relies on the json file contents? i.e. what is the role of the cached json?
Sorry for not being clearer.
The chef-site source loads chef to parse the metadata.rb file, if there's a metadata.rb and no metadata.json.
There are many cached json files in the chef-site source, but these are the API responses from the Opscode Cookbooks API and don't have anything per se to do with cookbooks directly.
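To make the lookup order concrete, here is a simplified sketch of reading a cookbook's version from its directory: `metadata.json` when present, otherwise `metadata.rb`. Note the assumptions: `cookbook_version` is a hypothetical name, and the regular expression is only a stand-in for what the chef-site source actually does, which is to load Chef and evaluate `metadata.rb`.

```ruby
require "json"

# Hypothetical helper: return the cookbook version found in `dir`,
# or nil if neither metadata file yields one.
def cookbook_version(dir)
  json_path = File.join(dir, "metadata.json")
  rb_path   = File.join(dir, "metadata.rb")

  if File.exist?(json_path)
    JSON.parse(File.read(json_path))["version"]
  elsif File.exist?(rb_path)
    # Stand-in for evaluating the file with Chef: match a literal
    # `version "x.y.z"` line. Real metadata.rb files can be arbitrary Ruby,
    # so this shortcut is an assumption for illustration only.
    File.read(rb_path)[/^\s*version\s+["']([^"']+)["']/, 1]
  end
end
```

A plain HTTP source built on tarballs could use the same order after unpacking, with no cached API responses involved.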
Would it be possible to support the cookbooks collection on Github? You'd likely be accessing via the Github API, e.g checking availability etc.
I've recently (yesterday) made the first run at tracking repo specific cookbooks (some of heavywater's cookbooks). Hopefully over time cookbooks will evolve into a collection that:
[...] `qa` branch)
Of course this is a classic chicken-and-the-egg problem: lots of people won't use this collection until it is easy to use, the collection will be easy to use when bugs are ironed out by lots of people using them.
Librarian support for `site :cookbooks` would at least make it easy for people to consider using. If this is possible I'd remark that the `qa` branch should be considered the source branch, the `master` branch being reserved for upstream.
Thoughts?