Support requiring Gemfiles inside of Gemfiles (i.e. Gemfileception)

dtognazzini commented 9 years ago

Background

Gems, by themselves, only declare dependencies; they do not say how to satisfy them. Bundler declares gem dependencies for package management and also specifies how to satisfy them (e.g. source, path, gem(git:), etc.).

It would be nice to reuse Gemfile declarations (both dependency specification and satisfaction) in other Gemfiles without copying and pasting.

Proposal

Provide a require_gemfile method in Bundler::DSL that takes a path to another Gemfile that can hold common dependency resolution declarations. For example:

<root>/Gemfile

require_gemfile "common/Gemfile"
gem "some_gem"

<root>/common/Gemfile

source "https://my_fav_gemserver.net"
path "gems" do
  gem "my_fav_gem"
end

The above would be identical to the following single file:

<root>/Gemfile

source "https://my_fav_gemserver.net"
path "common/gems" do
  gem "my_fav_gem"
end
gem "some_gem"

Notice: the resulting Gemfile nests the relative paths of the required Gemfile (common/Gemfile) under the relative path from the requiring Gemfile (common/).
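The rebasing described above can be sketched with a toy DSL. This is only an illustration under stated assumptions: GemfileSketch is a hypothetical class, not Bundler::Dsl, it reads "files" from an in-memory hash instead of disk, and it only models the block form of path.

```ruby
require "pathname"

# Hypothetical mini-DSL for illustration only; this is not Bundler::Dsl.
class GemfileSketch
  attr_reader :sources, :gems

  # files: in-memory map of relative file name => Gemfile text
  def initialize(files)
    @files = files
    @base = Pathname.new(".") # directory of the file being evaluated
    @path = nil               # path source currently in effect, if any
    @sources = []
    @gems = []                # [name, path or nil] pairs
  end

  # Evaluate another file; while it runs, relative paths resolve
  # against that file's own directory.
  def require_gemfile(file)
    previous_base = @base
    file = (@base + file).cleanpath
    @base = file.dirname
    instance_eval(@files.fetch(file.to_s), file.to_s)
    self
  ensure
    @base = previous_base
  end

  def source(url)
    @sources << url
  end

  # Block form only: rebase the directory onto the declaring file's directory.
  def path(dir)
    previous_path = @path
    @path = (@base + dir).cleanpath.to_s
    yield if block_given?
  ensure
    @path = previous_path
  end

  def gem(name)
    @gems << [name, @path]
  end
end

files = {
  "Gemfile" => <<~GEMFILE,
    require_gemfile "common/Gemfile"
    gem "some_gem"
  GEMFILE
  "common/Gemfile" => <<~GEMFILE
    source "https://my_fav_gemserver.net"
    path "gems" do
      gem "my_fav_gem"
    end
  GEMFILE
}

sketch = GemfileSketch.new(files).require_gemfile("Gemfile")
p sketch.gems
```

In this sketch my_fav_gem ends up associated with common/gems even though common/Gemfile only said "gems", which is the behaviour the proposal asks for.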

Use Case 1: Sharing common gems

This is very similar to bundler/bundler#3102, which uses eval_gemfile. In addition to sharing dependency declaration, require_gemfile allows sharing all supported DSL methods.

What require_gemfile offers over eval_gemfile is that it rebases relative paths from the required Gemfile to relative paths from the requiring Gemfile.

Use Case 2: Sharing source code, repository local gem paths

It's become a common pattern amongst Rails projects to decompose the system into an ecosystem of inline gems containing Rails Engines. The main Rails Application pulls in the inline gems by declaring dependencies in the application-level Gemfile. For example:

<system_root>/Gemfile

path "gems" do
  gem "subsystem_one"
  gem "subsystem_two"
  gem "subsystem_three"
end

If subsystem_one grows crazily, it could be further decomposed into many gems, perhaps:

<system_root>/subsystem_one/subsystem_one.gemspec

 s.add_runtime_dependency("subsystem_one-authorization")
 s.add_runtime_dependency("subsystem_one-billing")
 s.add_runtime_dependency("subsystem_one-analytics")

<system_root>/subsystem_one/Gemfile

# path "gems" here would be <system_root>/subsystem_one/gems
path "gems" do
  gem "subsystem_one-authorization"
  gem "subsystem_one-billing"
  gem "subsystem_one-analytics"
end

To support this decomposition, the main application's Gemfile would have to be updated:

<system_root>/Gemfile

# Declare dependencies on subsystems
path "gems" do
  gem "subsystem_one"
  gem "subsystem_two"
  gem "subsystem_three"
end

# Add the path to find gems required by subsystem_one
path "subsystem_one/gems"

Updating the top level Gemfile for subsystems undergoing structural refactors like this is somewhat painful but reasonably maintainable provided there is only one, conventional "gems" repository for each layer.

When it becomes unmaintainable

This strategy becomes unmaintainable with one more level of layering.

For example, consider a super_system project that pulls in the root of the above project (named here as system_1) and the root of another project (system_2):

<super_system>/systems/system_1.gemspec

 s.add_runtime_dependency("subsystem_one")
 s.add_runtime_dependency("subsystem_two")
 s.add_runtime_dependency("subsystem_three")

<super_system>/Gemfile

path "systems" do
  gem "system_1"
  gem "system_2"
end

The super_system project would have to include paths in its Gemfile for the gems required by system_1:

<super_system>/Gemfile


# Declare dependencies on the systems
path "systems" do
  gem "system_1"
  gem "system_2"
end

# Add paths to find gems required by system_1 and its subsystems
path "systems/system_1/gems"
path "systems/system_1/gems/subsystem_one/gems"

The last line couples super_system to the internal structure of system_1.

Contrived?

This example may seem a bit contrived, but I work on something similar to this whereby a system is comprised of many Rails Applications. Each Rails Application supplies its own inline gem that the system depends on in its Gemfile. The inline gems provided by the Rails Applications are used by the system to interact with the applications. In the usage above, every restructuring internal to system_1 would need to be handled by the Gemfile for super_system.

With require_gemfile, using a convention whereby systems specify their path repositories in a Gempaths file, the above would be simpler:

<super_system>/Gemfile

# Reuse gem paths defined by system_1
require_gemfile "systems/system_1/Gempaths"

path "systems" do
  gem "system_1"
  gem "system_2"
end

<super_system>/systems/system_1/Gemfile

# Use gem paths defined for system_1
require_gemfile "Gempaths"

<super_system>/systems/system_1/Gempaths

# Reuse gem paths defined by the subsystems
require_gemfile "gems/subsystem_one/Gempaths"
require_gemfile "gems/subsystem_two/Gempaths"
require_gemfile "gems/subsystem_three/Gempaths"

<super_system>/systems/system_1/gems/subsystem_one/Gemfile

# Use gem paths defined for subsystem_one
require_gemfile "Gempaths"

<super_system>/systems/system_1/gems/subsystem_one/Gempaths

path "gems"

With the above, no layer knows about the internals of the layer beneath it.
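To make the layering concrete, here is a minimal sketch of how the nested Gempaths files could expand into the flat list of paths that super_system would otherwise have to spell out by hand. It assumes each file can be reduced to [directive, argument] pairs held in memory, models only subsystem_one, and uses a hypothetical collect_paths helper; none of this is Bundler code.

```ruby
require "pathname"

# Expand require_gemfile directives recursively, rebasing every relative
# path onto the directory of the file that declared it.
def collect_paths(file, files)
  dir = Pathname.new(file).dirname
  files.fetch(file).flat_map do |directive, argument|
    target = (dir + argument).cleanpath.to_s
    directive == :require_gemfile ? collect_paths(target, files) : [target]
  end
end

# Hypothetical in-memory stand-ins for the files in the example.
layered_files = {
  "Gemfile" => [
    [:require_gemfile, "systems/system_1/Gempaths"],
    [:path, "systems"]
  ],
  "systems/system_1/Gempaths" => [
    [:require_gemfile, "gems/subsystem_one/Gempaths"]
  ],
  "systems/system_1/gems/subsystem_one/Gempaths" => [
    [:path, "gems"]
  ]
}

puts collect_paths("Gemfile", layered_files)
```

The top-level file never mentions subsystem_one; the deep path is derived from what each layer declares about itself.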

ccutrer commented 9 years ago

I should mention that in a way we already monkey-patch bundler from our Gemfile to support a separate lockfile - if there is a specific environment variable to enable (experimental) support for Rails 4 in our app, it uses Gemfile.lock4. This allows us to both switch between "normal" and Rails 4 mode very quickly during development (not having to bundle update between them, or play some other trick swapping out lockfiles), but also when slowly rolling to production, we have a single release with both lockfiles, and we can A/B test them, or slow roll. Probably not something recommended for everyday use of bundler, though.

ccutrer commented 9 years ago

@johnnyshields it would slightly alleviate one of our use cases (i.e. using a Gemfile fragment in addition to a gemspec in order to specify the source of a dependent, non-released gem; which is really not a big deal). It does not solve maintaining a massive Gemfile by breaking it into multiple fragments, or managing a lockfile that may or may not have additional proprietary dependencies.

johnnyshields commented 9 years ago

IMHO the reason for the massive Gemfile is that git gem dependencies can't be specified at a per-gem level.

I don't see the point of breaking a large Gemfile into a lot of smaller Gemfiles. In theory you should be breaking your project into lots of smaller gems, each with its own gemspec. Breaking up the Gemfile alone does nothing to reduce the complexity of your project codebase, your tests take the same amount of time to run, etc. There is no real benefit other than a purely cosmetic one: you prefer to read N files of size M rather than 1 file of size N*M.

johnnyshields commented 9 years ago

For the lockfile, for the sake of argument let's assume Bundler knows how to read the :git and :file references from the gemspecs and builds an appropriate lock-file considering those proprietary dependencies.

ccutrer commented 9 years ago

This is our "Gemfile", sans anything needed as dependencies of private gems: https://github.com/instructure/canvas-lms/tree/stable/Gemfile.d. While it would be trimmed down if we weren't explicitly mentioning dependencies-of-dependencies (for locking purposes), it would still be large and unwieldy. Actually, I should probably break rails3 vs. rails4 dependencies into their own files. Also notice that we do indeed have a multitude of local gems (under the gems folder, and each mentioned by name in the other_stuff.rb part of Gemfile.d).

So, in a nutshell, no, our huge Gemfile (and wanting to break it up) has nothing to do with gem dependencies needing to specify their own dependencies as coming from git source, but a combination of it flat out being a massive project (historically monolithic, but slowly being broken up), and not properly using a lockfile (due to the proprietary plugin problem).

johnnyshields commented 9 years ago

OK it looks like you're using some conditional build logic depending on what the user wants, i.e. "postgres on Rails 4" would build one output Gemfile vs. "mysql on Rails 3" -- something like that?

ccutrer commented 9 years ago

Kind of. More like "is there stuff in gems/plugins? Pull in all the gems there". So non-open-sourced stuff just sits there, and you don't have to do anything else: the "master" Gemfile (more specifically, Gemfile.d/plugins.rb) dynamically pulls in the additional gems. Gemfile.d/~after.rb does the same for legacy plugins in vendor/plugins that aren't gems but have a Gemfile to declare their dependencies; that file also supports Gemfile fragments from plugins in gems/plugins, so they can declare the source for non-released gems. Having two layers of dependencies on local gems is actually quite rare, and even in that case is handled by them being siblings, and the master Gemfile automatically pulling in all the local gems.
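A rough sketch of the "pull in whatever is there" idea follows. The helper names gemfile_fragments and plugin_gem_dirs are made up for illustration and this is not the canvas-lms code; only the directory names Gemfile.d and gems/plugins come from the comment above.

```ruby
# Fragment files under Gemfile.d, in load order. Plain string sorting puts
# "~after.rb" last because "~" sorts after ASCII letters and digits.
def gemfile_fragments(root)
  Dir.glob(File.join(root, "Gemfile.d", "*.rb")).sort
end

# Directories under gems/plugins, assuming one directory per local gem.
def plugin_gem_dirs(root)
  Dir.glob(File.join(root, "gems", "plugins", "*"))
     .select { |entry| File.directory?(entry) }
     .sort
end

# Inside a real Gemfile this might be used roughly as follows
# (eval_gemfile and gem are Bundler DSL methods):
#
#   gemfile_fragments(__dir__).each { |fragment| eval_gemfile(fragment) }
#   plugin_gem_dirs(__dir__).each { |dir| gem File.basename(dir), path: dir }
```

Because the lists are computed at bundle time, dropping a proprietary gem into gems/plugins is enough to have it picked up, which is also why a single committed lockfile becomes awkward.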

johnnyshields commented 9 years ago

Just for reference, Node.js NPM actually allows you to load two different versions of the same package (=~ gem) at the same time if there are conflicting dependencies, due to the way its module system works. Rubygems can't do this because its gems pollute the global namespace with their constants, e.g. attempting to load ActiveRecord 4.0 would overwrite ActiveRecord 3.0 in memory if you've already loaded it. This sort of thing could be emulated in Ruby but is not the standard, see here for more detailed discussion: http://andrew.ghost.io/emulating-node-js-modules-in-ruby/
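The overwrite can be reproduced in a few lines of plain Ruby. Widget below is a hypothetical library standing in for two versions of the same gem; in practice each definition would sit in its own file and be pulled in with require.

```ruby
# "widget 3.0": defines the top-level constant Widget.
module Widget
  def self.version
    "3.0"
  end

  def self.legacy_api
    "3.0 only"
  end
end

# "widget 4.0": does not get a namespace of its own. It reopens the same
# global Widget module and replaces whatever it redefines.
module Widget
  def self.version
    "4.0"
  end
end

puts Widget.version    # the later definition wins
puts Widget.legacy_api # leftovers from the earlier version are still present
```

The result is neither version but a blend of both, which is why only one version of a gem can be activated per process.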

dtognazzini commented 9 years ago

It's been a while since I've looked at this and it's quite difficult even for me to distill my exact use cases.

To summarize, in a single Rails project comprised of internal, project-local gems (or Rails Engines), I would use this feature for reusing the following specifications across Gemfiles used by the test environments for each project-local gem:

path specifications to ensure all test environments can find other project-local gems.
source specifications to ensure all test environments use the same version for gems external to the project.
group specifications to share commonly used gems in the Rails' test environment.

All of these uses are policy decisions controlled by the project. If/when a project-local gem finds usefulness independent of the project, I would carve out the gem into a new repository and serve via a gem server.

This discussion explores one strategy of accomplishing this with require_gemfile in more detail.

Still I wonder about this:

Perhaps a different implementation whereby you could use plain-old require in a Gemfile would be sufficient. Rake supports this by extending the main object with the Rake DSL, whereas Bundler instance_evals the Gemfile. I wonder if it would be possible to rework some of the internals to allow for using plain-old require.

In that world a path specification would be relative to the file it's used in.
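The two evaluation styles can be compared side by side. This is a sketch under stated assumptions: TinyDSL, Builder, and the dep method are hypothetical names, and neither function is taken from Rake or Bundler.

```ruby
require "tempfile"

# Hypothetical DSL with a single method; `dep` is an illustrative name.
module TinyDSL
  def deps
    @deps ||= []
  end

  def dep(name)
    deps << name
  end
end

class Builder
  include TinyDSL
end

# Style 1 (instance_eval): the file's text is evaluated against a builder
# object, so the file itself cannot simply `require` a sibling fragment
# and have that fragment see the DSL.
def eval_style(source)
  Builder.new.tap { |builder| builder.instance_eval(source) }.deps
end

# Style 2 (extend main): the DSL is mixed into the top-level object, after
# which an ordinary `load` or `require` of a file can call DSL methods.
def load_style(source)
  main = TOPLEVEL_BINDING.receiver
  main.extend(TinyDSL)
  main.deps.clear
  Tempfile.create(["fragment", ".rb"]) do |file|
    file.write(source)
    file.flush
    load file.path
  end
  main.deps.dup
end

p eval_style(%(dep "rake"))
p load_style(%(dep "rack"))
```

With the second style each loaded file also knows its own location through __dir__, which is what would let a path specification be relative to the file it appears in.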

bbozo commented 9 years ago

Hello :smiley:

we have a project dealing with 2 rails 4 apps hanging on the same database, a jruby project handling a mobile API and a C ruby CMS of sorts.

To be able to manage this we have extracted the common code into a rails engine (mostly models, some libs). One day the API server started generating hashes for paperclip in a different way from the CMS server, which with S3 means that S3 assets can't be accessed. After much digging I figured out that it's indirectly a freaky dependency issue, an easy mistake to make when you have 2 Gemfiles and 2 Gemfile.locks which share pretty much everything except simple form and choice of rails server.

So I set out to fix the problem once and for all so it doesn't repeat, and OK, I know that you can put the common gems into the gemspec of the engine so I did that. However when I tried that, I realized that those gems don't have git support, local path support and that they don't get required on their own, so it's a bit of a no go, or at least in many ways a step down from what we have now.

Then I googled for "how to handle common gem dependencies with engine" and this was the first, and only, link.

I was very happy, now I'll go eat chocolate and think about options :disappointed:

johnnyshields commented 9 years ago

@bbozo, yes exactly. Welcome to my world. The current Gemfile / gemspec paradigm does not scale well with complex apps, and a large part of the problem is how github dependencies are handled. As the Ruby community seems to care little about this problem, I think the best answer is:

1) build a time machine
2) go back in time to when you were deciding which framework/language to write your app(s) with
3) use node.js

bbozo commented 9 years ago

I'm starting to get the feeling that bundler source code has gone a bit amok,

basically there's a long line of closed or "low priority" issues that hold back the "split complex rails app into gems/engines" mantra that the Rails team has been pushing a couple years now.

For example:

#65 - this issue prevents keeping gem versions in sync among apps using a common engine
https://github.com/bundler/bundler/issues/3571 - :path is not supported with "package" - this means that any deployment mechanism depending on bundle package can't work with local gems
https://github.com/bundler/bundler/issues/2016 - workaround for last issue is to use git push on the engine and bundle update on the rails apps using it, which is irritating but OK - however when you go this route then every time you want to push a 1-liner code change bundler will try to update ALL of rails dependencies because an engine gem depends on rails. I tried removing rails from the engine gemspec, but it was no use, all of rails dependencies still got pulled

Unfortunately @indirect has been very defensive about all of these issues, according to him:

using :git is a hack, you should publish private engines to rubygems.com
using :path means "you handle packaging"
not updating rails engine dependencies when you want to patch a piece of code from the engine is simply hard and not going to happen

Unfortunately, one gets a feeling that this defensiveness comes from the possible fact that bundler has become too unwieldy to follow Rails development goals and emerging deployment standards,

I get this feeling because in this issue @indirect said that 3000$ isn't enough money for the trouble of developing a feature that should be a priority for the rails community anyway - the ability to handle large projects - so I'm guessing Rails has come to the end of the line where Bundler is concerned and that the rift will continue to grow.

Also, I get a bad feeling from the fact that this issue is the only one to which so far I have seen @indirect help someone circumvent a Bundler design problem, and this was only after he confirmed that @johnnyshields would pay 500$ for a hackish solution.

In fact he says it clearly:

https://github.com/bundler/bundler-features/issues/65#issuecomment-61932167 (To be super clear to anyone reading this ticket later: the above is a clever but dirty hack, it could break in future versions of Bundler, and the Bundler team only provides help with things like this if you provide monetary compensation.)

All of this leaves a very bad taste in my mouth because I've personally lost at least 2 weeks of my life handling these issues, and some of these discussions read as an Oracle support worker trying to explain to his clients that what they're seeing is an intended feature and not a serious design issue that needs addressing, and any help will cost ya 500$

Not what I've come to expect from OS community....

Anyway, my 2 cents

bbozo commented 9 years ago

Rails core topic: https://groups.google.com/forum/#!topic/rubyonrails-core/Q0yj6XJEaew

segiddins commented 9 years ago

@bbozo I'd ask you to please be polite to André. He has spent thousands of hours on bundler, and he, I, and the rest of the bundler team care greatly about making bundler the best tool it can be.

As for this thread, I think it's gone on long enough. The official position of the bundler team is to recommend that people publish their gems to their own gem servers, and that recommendation is unlikely to change without a drastic change in how rubygems or bundler work.

As always, pull requests to bundler are very welcome to improve things you find to be rough edges, but please do keep in mind that we're the team stuck maintaining everything in the long run, and to that end we've made the decision that this feature is not one we wish to officially support. Thanks everyone for chiming in!

johnnyshields commented 9 years ago

The big problems, in my view:

1) Gemspec and gemfile are separate. Gemfile allows using :github/:path, the gemspec does not. Gemfile is for "projects", gemspec is for "gems".

In NPM (node.js) there is a single file called "package.json" which handles both responsibilities in one schema.

2) Ruby gems pollute the global namespace. If I bundle two gems that define the namespace Foobar, one will monkey patch the other.

JavaScript (ES6) has a module system where each file must import and export explicitly. Thus I can do something like import { Foobar } from "FoobarGem1" and import { Foobar as Bazqux } from "FoobarGem2" to avoid kludges.
Further point: when NPM gets dependencies, it gets a tree of dependencies specific to each library. Hence you never get into the situation where two gems (A and B) with a common third dependency have a clash (A --> C >= 1.0 and B --> C < 1.0); each package gets the correct version of its own dependencies.
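The A/B/C clash can be checked with RubyGems' own version classes. In this sketch versions_satisfying_all is a hypothetical helper and the candidate versions are made up; Gem::Requirement and Gem::Version are the real RubyGems classes.

```ruby
require "rubygems"

# Return the candidate versions that satisfy every requirement at once,
# which is what a flat, one-version-per-gem resolution needs.
def versions_satisfying_all(requirements, candidates)
  parsed = requirements.map { |requirement| Gem::Requirement.new(requirement) }
  candidates
    .map { |version| Gem::Version.new(version) }
    .select { |version| parsed.all? { |requirement| requirement.satisfied_by?(version) } }
    .map(&:to_s)
end

# A needs C >= 1.0, B needs C < 1.0; the candidate list is illustrative.
shared = versions_satisfying_all([">= 1.0", "< 1.0"], %w[0.9 1.0 1.1])
puts shared.empty? ? "no single version of C works for both A and B" : shared.join(", ")
```

Since only one version of C can be activated in a Ruby process, an empty result means the bundle cannot be resolved, whereas a per-package dependency tree would give A and B their own copies.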

Both of these are major obstacles to scaling Ruby in production and are costing me real money in wasted effort. Frankly speaking I much prefer coding in Ruby to JavaScript, and Rails is a great framework. But the JS community has its shit together, the Ruby community does not. Not to disparage the work Bundler and Andre have done, but Bundler has fundamental design flaws some of which stem from design flaws in Ruby (lack of ES6 or Python-like module system).

(Copying to Ruby Core thread)

bbozo commented 9 years ago

I appreciate your good work here, but some of the support practices seem to have gone a bit off (500$ for a hack that was already implemented by the canvas-lms team https://github.com/instructure/canvas-lms/commits/stable/Gemfile.d/sqlite.rb) and also if there's a problem with code maintenance that can't be fixed - it'd be good to let people know so they can plan accordingly with new projects

Also, it would be good to know the cost of tech support before adding a Bundler.require to a new project

@segiddins @indirect please let me know if I can update parts of my post to not be impolite while still reflecting reality, it is not my intent to insult :heart:

johnnyshields commented 9 years ago

@indirect has done a fantastic job solving the bundler problem under certain assumptions (global namespaces, separation of Gemspec vs. gemfile). If we remove those assumptions however (as NPM has done) a much more scalable solution becomes possible--imagine no more gem conflicts!

While this would certainly be a huge undertaking, it is possible, and it could be done in a backwards compatible fashion. It's worth noting that Javascript has achieved such a transition, though it's taken several iterations. For example, there were a variety of competing attempts to solve the module (import/export) system via libs--RequireJS, AMD, etc--before it was ultimately incorporated at the language level in EcmaScript 6.

johnnyshields commented 9 years ago

@brixen interested to hear your thoughts on this.

bf4 commented 9 years ago

@johnnyshields @bbozo @dtognazzini There's a lot of words here, and I may be misunderstanding your use case, but you might want to try this in your Gemfile so you can bundle a gem from git or from a local path: http://www.benjaminfleischer.com/2011/11/01/bundler-ruby-gem-development-tricks/

if ENV['APP_GEMS_DIR']
  # for local gem development, use local gem directory so gem changes don't need to be pushed to test
  gem 'my_gem', :path => "#{ENV['APP_GEMS_DIR']}/my_gem", :require => "namespace/my_gem"
else
  gem 'my_gem', :require => "namespace/my_gem", :git => path_to_git_repo, :tag => tagname
end

Rubygems isn't going to change to package and install gems (which is what the path/git options in bundler do. They do the job of gem build, in addition to gem install).

Whenever I've worked on code that started to get crazy with interactions between a gemspec and a Gemfile, I wanted to change the app architecture, not the tools. :)

If you think about it a gemspec's job is to declare dependencies for a library. A Gemfile's job is to declare dependencies for an app.
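The split can be seen in what a gemspec dependency actually records. The spec below is hypothetical (my_engine and its field values are placeholders); Gem::Specification and its dependency objects are the real RubyGems API.

```ruby
require "rubygems"

# Build a hypothetical library gemspec in memory.
def build_example_spec
  Gem::Specification.new do |s|
    s.name    = "my_engine"
    s.version = "0.1.0"
    s.summary = "Hypothetical shared engine"
    s.authors = ["Example Author"]
    # A library dependency is a name plus a version requirement.
    s.add_runtime_dependency "rails", ">= 4.0"
  end
end

build_example_spec.runtime_dependencies.each do |dependency|
  puts "#{dependency.name} #{dependency.requirement} (#{dependency.type})"
end

# An application's Gemfile is where the "where does it come from" part
# lives. Roughly (Bundler DSL, shown as comments; the path is a placeholder):
#
#   source "https://rubygems.org"
#   gemspec                               # reuse the gemspec's dependencies
#   gem "my_engine", path: "../my_engine" # say where to find one of them
```

The add_runtime_dependency call takes only a name and version requirements, which is the limitation the earlier comments about git and path support in gemspecs are running into.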

Also maybe see https://groups.google.com/forum/#!topic/ruby-bundler/G35fht6T3yA/

johnnyshields commented 9 years ago

@bf4 please read https://github.com/bundler/bundler-features/issues/65#issuecomment-120257396 carefully.

The problem is not "how to kludge together multiple Gemfiles", and cannot be fixed by a bandaid. There are fundamental design assumptions in Bundler / Rubygems that make scaling Rails beyond "weekend hobby" apps difficult. The more gems in your Gemfile, the more you feel the pain (I'm at 200 and counting)

bf4 commented 9 years ago

@johnnyshields It's hard to say without seeing your code, but statements like "scaling Rails beyond 'weekend hobby' apps difficult" and "Ruby gems pollute the global namespace" make it sound like you're missing something. Plenty of Rails apps scale beyond weekend hobby. And the notion of unintentionally re-opening/re-defining classes in Ruby is one of design, rather than a language issue. In the old days, Rake had a module called Task. So, a Rails app with a Task model would interact with Rake in weird ways. Nowadays, the module in Rake would be Rake::Task. That is, good library conventions put the whole library under a namespace. That's just how it is.

I'm going to stop responding since I'm pretty certain what you want isn't possible without changing the nature of Ruby and rewriting much of its ecosystem.

johnnyshields commented 9 years ago

@bf4 have you looked at NPM and ES6 modules?

johnnyshields commented 9 years ago

In the old days, Rake had a module called Task. So, a Rails app with a Task model would interact with Rake in weird ways.

That's exactly what I mean by "polluting the global namespace." The problem is not merely name conflicts (which are a big pain), but even more so because you can only have a single version of a given gem living in your app ecosystem at once. Please study NPM / ES6, which once had the same limitations as Ruby, but has overcome them because the JS community is not content to say "That's just how it is."

indirect commented 9 years ago

@bbozo Bundler is both free and open source software. If you don't like how it works, you are welcome to take the source code and make it work the way you want it to instead. Many, many other companies have been able to make Rails engines scale for them without needing this feature. It might be more productive for you to ask them how they did it, rather than demand hundreds or thousands of hours of free software development.

@johnnyshields Bundler is only able to work within the constraints of Ruby and RubyGems. If you want changes to RubyGems, ask the RubyGems team, not the Bundler team. If you want changes to Ruby, ask the Ruby team. If you want to be using NPM instead, please use Node.

Unfortunately, the discussion in this issue has moved beyond a feature request for Bundler, and has turned into demands that gemspecs and even the nature of constants in Ruby itself should be changed. We appreciate the feedback, but we don't accept feature requests for either RubyGems or Ruby here, so I'm going to lock this ticket.

The Bundler team values feedback, a lot, and we are happy to discuss and work with anyone having problems to try to help them find a solution. Personal attacks and demands for hundreds or thousands of hours of development work are both counter-productive if what you want is a solution to your problem, and a good way to get threads locked and possibly even get yourself banned from the Bundler issue tracker.
