mdsol / dice_bag

DiceBag is a library of rake tasks for configuring web apps in the style of The Twelve-Factor App.
MIT License
19 stars 4 forks source link

Exclude vendor/cache directory by default #67

Closed benton closed 10 years ago

benton commented 10 years ago

12-factor apps should bundle all their requirements - this includes all source code for all dependent gems. This practice insulates the project from "orphaned" public repos that may disappear (as happened to Balance just last week), and also allows apps to be deployed Heroku-style, without a private SSH key present. The best way to achieve this with Ruby apps is the bundler gem's package command:

bundle package

which copies all the gems into vendor/cache. This directory should then be committed to the source repo, so that a checkout of the top-level repo is all that is required to build and release the app. (Subsequent commands to bundler will recognize this location)

Unfortunately, a side-effect of this practice is that Dicebag is (re-)rendering all the *.dice files in the bundled gems as well, to potentially dangerous effect. Here's a sample rake release from mauth:

File '/app/vendor/cache/dice_bag-5efbc98d0747/lib/dice_bag/templates/google_analytics.yml' created
File '/app/vendor/cache/dice_bag-5efbc98d0747/lib/dice_bag/templates/aws.yml' created
File '/app/vendor/cache/dice_bag-5efbc98d0747/lib/dice_bag/templates/dalli.yml' created
File '/app/vendor/cache/dice_bag-5efbc98d0747/lib/dice_bag/templates/database.yml' created
File '/app/vendor/cache/dice_bag-5efbc98d0747/lib/dice_bag/templates/newrelic.yml' created
File '/app/config/privileged_uuids.yml' created
File '/app/config/app_uuid.yml' created
File '/app/config/dalli.yml' created
File '/app/config/newrelic.yml' created
File '/app/config/amazon_s3.yml' created

As you can see, it's the templates from Dicebag itself that are being overwritten, so no harm is caused to the operation of mAuth in this case. But if libraries (like Eureka-client or mAuth-client) use Dicebag for their own purposes, and start to render their own YML files for inclusion into their codebase, then these will be re-written on release, in an environment unsuited for them.

In my view, it would be worth adding an exception to Dicebag's normal recursive search to respect this behavior, and ignore the vendor/cache directory (perhaps all of vendor?). What say you?

(P. S. I believe this ancient wisdom holds: http://ryan.mcgeary.org/2011/02/09/vendor-everything-still-applies/)

fosdev commented 10 years ago

@benton Seems like a reasonable proposal to me from a be a bit paranoid standpoint. However, I don't think you would ever want define a .yml file use dice_bag in the ../templates directory of these libraries. Effectively there is just a file.ext.dice file there and so dice_bag renders this, which I suspect would only be cruft as long as they did not error in the environment they are being rendered in. But that would just mean they were not well designed default .dice templates.

@asmith-mdsol Am I missing something here. Seems the only problem here is cruft or errors due to brittle .dice templates.

benton commented 10 years ago

I agree that included libraries are unlikely to depend on YML files generated from their own .dice templates, but I am worried about .dice templates that don't provide defaults. I guess any such issues will show up when rake config is run with the packaged bundle in place -- hopefully before a real deployment.

I also understand if you want to keep bundler-specific logic out of Dicebag (who knows when bundler will change, right?). Therefore, if you're confident that runtime errors are extremely unlikely, and that the problem of lazy .dice templates is better caught early anyway, we can ignore the whole (potential) problem for now.

benton commented 10 years ago

This is no longer an academic concern. Here's a concrete example of the nested template problem, which I just encountered when trying to create a 12-factor version of Eureka:

╭─benton@hume ~/projects/medidata/eureka ‹1.9.2-p290› ‹feature/12factor*› 
╰─➤  bundle exec rake config:deploy                                                                                  1 ↵
rake aborted!
no ENTITY_STORE defined in production
(erb):13:in `create_file'
Tasks: TOP => config:deploy
(See full trace by running task with --trace)
╭─benton@hume ~/projects/medidata/eureka ‹1.9.2-p290› ‹feature/12factor*› 
╰─➤  grep -r ENTITY_STORE .                                                                               1 ↵
vendor/cache/eureka-client-698ee7e5eebe/lib/euresource/dice_bag/euresource.rb.dice:  (environment == 'production' ? raise("no ENTITY_STORE defined in #{environment}") : 'file:cache/rack/body') %>

Eureka embeds eureka-client, which insists that an $ENTITY_STORE be defined. Presumably this is not a required environment variable for Eureka itself, but perhaps I'm mistaken.

I think Dicebag needs a solution for this, even if it's just an optional --exclude [dir] command-line parameter to the config tasks.

fosdev commented 10 years ago

Well, that touches on what I was saying that dice_bag templates may be brittle and should be written to gracefully handle/log issues vs. raise or whatever. But that is wishful thinking that they are tested for unspecified values, etc.

However, I think dice_bag should be able to recognize a .dice template is in the library location for its template and not try to populate it transparently without needing the --exclude switch.

On Tue, Mar 25, 2014 at 1:18 PM, Benton Roberts notifications@github.comwrote:

This is no longer an academic concern. Here's a concrete example of the nested template problem, which I just encountered when trying to create a 12-factor version of Eureka:

╭─benton@hume ~/projects/medidata/eureka ‹1.9.2-p290› ‹feature/12factor_› ╰─➤ bundle exec rake config:deploy 1 ↵ rake aborted! no ENTITY_STORE defined in production (erb):13:in `createfile' Tasks: TOP => config:deploy (See full trace by running task with --trace) ╭─benton@hume ~/projects/medidata/eureka ‹1.9.2-p290› ‹feature/12factor› ╰─➤ grep -r ENTITY_STORE . 1 ↵ vendor/cache/eureka-client-698ee7e5eebe/lib/euresource/dice_bag/euresource.rb.dice: (environment == 'production' ? raise("no ENTITY_STORE defined in #{environment}") : 'file:cache/rack/body') %>

Eureka embeds eureka-client, which insists that an $ENTITY_STORE be defined. Presumably this is not a required environment variable for Eureka itself, but perhaps I'm mistaken.

I think Dicebag needs a solution for this, even if it's just an optional --exclude [dir] command-line parameter to the config tasks.

— Reply to this email directly or view it on GitHubhttps://github.com/mdsol/dice_bag/issues/67#issuecomment-38615115 .

benton commented 10 years ago

Completely agree both that

  1. Dicebag templates should provide some default for each setting; and
  2. Sadly, Dicebag can probably not count on this being true.

I guess that we can avoid putting bundler-specific logic into Dicebag by checking whether the template in question is currently somewhere within the ruby $LOAD_PATH, and ignoring it in that case. If you agree, please consider this Issue a feature request! :)

harryw commented 10 years ago

I think the --exclude option sounds eminently sensible.

harryw commented 10 years ago

@asmith-mdsol if we were to add this --exclude option (with test coverage, of course) and make a PR, would you be likely to accept it? I don't want to do that speculatively without confirmation on the design.

Adding default exclusions would be a bigger change since it's not backwards-compatible. But we could go that route if you prefer. There might be some way to automatically discover the bundler cache path when running in a bundler context.

benton commented 10 years ago

FYI, I'm currently adding the following line to the Dockerfiles of both Eureka and mAudit. It truncates all the relevant templates to zero-length files:

# Workaround for https://github.com/mdsol/dice_bag/issues/67
RUN bash -c "find vendor/cache -type f -iname \*.dice -exec cp /dev/null {} \;"
asmith-mdsol commented 10 years ago

Addressing things in reverse order, yes @harryw I'd welcome a PR to add an --exclude option or (frankly) any other pragmatic way of dealing with this issue.

Given DiceBag is somewhat opinionated (well, I am at least) and one of those opinions is that Bundler is a Good Thing, I wouldn't even reject a simple change to exclude vendor/cache by default; we're pre-1.0 at this point so no violation of SemVer to worry about. (That said, as soon as this lands I'm likely to use it as an excuse to go to 1.0 as it's long past time.)

Another thought would be to provide an --exclude-bundler-package (or similar) convenience option that would abstract away the actual path.

However, I think it might be best of all to use the heuristic of spotting a Gemfile.lock and then excluding vendor/cache automatically. This would seem to prevent unexpected behaviour (to me at least) and Just Work for most people. Anyone who's in a bizarre situation can raise an issue and we'll add an --include-bundler-package option. So, that's my preference. However, I'm not volunteering to fix this, so if --exclude floats your boat, I'll accept it with gratitude!

On the topic of failing templates, I think that's the right of the template author to stop configuration (and hence deployment) so I wouldn't advocate a change there (although a friendlier error message would be nice).

(Finally, thanks @benton for the link to the blog post, completely agree with it. I have an interesting idea how to separate out the vendor/cache commits from "mainline" commits on (say) develop if you're interested to here about it but there's not enough room in this margin to show my working. :wink:)

harryw commented 10 years ago

A little experimentation with my local Bundler (v1.5.3) has shown that Bundler.settings["path"] yields the path Bundler's using for gems if one is set (via --path or --deployment), or nil otherwise. So we should be able to exclude that automatically without too much trouble.

harryw commented 10 years ago

This issue should be resolved in PR #71.

benton commented 10 years ago

develop branch is working for me now, so I'll close this issue. But I have several 12F apps that just pull the public version. What's the process for release on this product?