postmodern / chruby

Changes the current Ruby
MIT License

Use RUBY_ENGINE_VERSION to decide the GEM_HOME #410

Closed eregon closed 4 years ago

eregon commented 5 years ago

cc @postmodern @havenwood

eregon commented 5 years ago

I think this is also safer for JRuby. JRuby doesn't have C extensions but it has JRuby extensions which access JRuby internals that could change between JRuby releases with the same RUBY_VERSION and cause different javac compilation of JRuby extensions, leading to confusing require-time errors. While probably not common, it's certainly a risk and I think it's better for every Ruby installed to have its own GEM_HOME, rather than having this safety only for MRI.

Pure-ruby gems could be shared arbitrarily between all installed Rubies, I think that would be a way to share many gems safely if desired.

postmodern commented 5 years ago

I like this feature as it returns chruby to its original behavior back when the patch-level was included in the version string. However, changing GEM_HOME to use RUBY_ENGINE_VERSION will change the current behavior, so we'll need to release this in 0.4.0 or make a 1.0.0 release, and notify users of the change.
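
As a rough illustration of the change being discussed, here is a minimal Ruby sketch comparing a gem home keyed on RUBY_VERSION with one keyed on RUBY_ENGINE_VERSION. The helper name gem_home_for is hypothetical, and the ~/.gem/ENGINE/VERSION layout is assumed from the paths shown later in this thread; this is not chruby's actual code.

```ruby
# Hypothetical helper: builds a per-user gem home from an engine name and a version.
def gem_home_for(home, engine, version)
  File.join(home, ".gem", engine, version)
end

home = ENV["HOME"] || "/home/user" # placeholder fallback, for illustration only

# Older Rubies may not define these constants, so fall back conservatively.
engine = defined?(RUBY_ENGINE) ? RUBY_ENGINE : "ruby"
engine_version = defined?(RUBY_ENGINE_VERSION) ? RUBY_ENGINE_VERSION : RUBY_VERSION

current  = gem_home_for(home, engine, RUBY_VERSION)   # keyed on the language version
proposed = gem_home_for(home, engine, engine_version) # keyed on the implementation version

puts "current:  #{current}"
puts "proposed: #{proposed}"
```

On MRI the two paths should come out identical, since RUBY_ENGINE_VERSION matches RUBY_VERSION there; on an alternative implementation they would be expected to differ.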

eregon commented 5 years ago

@postmodern Good to hear :smiley:

I think 0.4.0 would be fine, as it's not really a hard breaking change (in fact it's safer behavior, so arguably some sort of bug fix), but it will require users to reinstall gems on alternative Ruby implementations (no changes on MRI). Since in most cases reinstalling gems is just a few bundle install runs away, I think it's a fairly easy process.

notify users of the change.

Do you mean an entry in the changelog, and/or something else?

postmodern commented 5 years ago

Definitely a ChangeLog entry and maybe a note in the README.

eregon commented 5 years ago

@postmodern I added a ChangeLog entry and updated the documented paths for GEM_HOME in the README. What do you think?

eregon commented 5 years ago

@postmodern Does the PR look good now? Could you merge it?

eregon commented 5 years ago

@postmodern @havenwood This is becoming a real problem for TruffleRuby users, so I would like to merge this or use RbConfig::CONFIG['ruby_version']. Related: https://github.com/oracle/truffleruby/issues/1715

I think chruby (and Ruby managers in general, and Bundler, and RubyGems) should use either:

For chruby, I would think RUBY_ENGINE_VERSION is a nicer transition as it wouldn't change anything for MRI users, only for JRuby/TruffleRuby/Rubinius users which would need to reinstall their gems when upgrading chruby.

A third way would be to just use RubyGems' Gem.default_path, which is basically [Gem.default_dir, Gem.user_dir], and swap the order:

# With no environment, the defaults are:
  - GEM PATHS:
     - /home/eregon/.rubies/ruby-2.6.2/lib/ruby/gems/2.6.0
     - /home/eregon/.gem/ruby/2.6.0
# With chruby:
  - GEM PATHS:
     - /home/eregon/.gem/ruby/2.6.2
     - /home/eregon/.rubies/ruby-2.6.2/lib/ruby/gems/2.6.0
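
To see the directories involved on a given Ruby, the following sketch prints the RubyGems defaults and then builds the swapped ordering suggested above. It only inspects values; how a Ruby manager would actually export the swapped list (for example through GEM_PATH) is left open here, and the variable name swapped is purely illustrative.

```ruby
require "rubygems"

# Gem directory inside the Ruby installation.
puts "default_dir:  #{Gem.default_dir}"
# Per-user gem directory.
puts "user_dir:     #{Gem.user_dir}"
# The search path RubyGems uses when no GEM_* variable is set.
puts "default_path: #{Gem.default_path.inspect}"

# The ordering suggested in the comment: user directory first, then the Ruby's own dir.
swapped = [Gem.user_dir, Gem.default_dir]
puts "swapped:      #{swapped.inspect}"
```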

Or maybe we should really solve this in RubyGems by defaulting to the user directory first, but I think that is going to be a long battle.

havenwood commented 5 years ago

I'd love to use RbConfig::CONFIG['ruby_version'] and go by pseudo-ABI, but it seems that's blocked until RubyGems supports --env-shebang by default.

I think this PR is the best path for now.

havenwood commented 5 years ago

We could also try to go the JRuby route with CRuby and TruffleRuby and enable --env-shebang upstream. It makes sense to me to make the change in RubyGems, but it unfortunately didn't make the RubyGems 3.0 release so it'll likely have to wait until 4.0 on the RubyGems side.

eregon commented 5 years ago

I gave this more thought over the weekend, and actually I think the best way would be to use the directory name used by chruby for the user gem home. For instance, ruby-2.6.3 would store gems in ~/.gem/ruby-2.6.3.

That way, even if I have ruby-2.6.1-enabled-shared (configured with --enable-shared) and ruby-2.6.1, they won't share the gem home, which is actually necessary as extensions compile differently with that setting and probably other settings too. Doing that would both handle MRI builds with different configure options and avoid mixing gems from multiple ABI versions in the same directory.
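
A minimal sketch of that idea, assuming the gem home is simply ~/.gem/ followed by the basename of the Ruby's install directory. The helper name gem_home_for_ruby_root is hypothetical and this is not the code of any actual chruby PR.

```ruby
# Hypothetical helper: derive the gem home from the Ruby's install directory name.
def gem_home_for_ruby_root(ruby_root, home)
  File.join(home, ".gem", File.basename(ruby_root))
end

# Two builds of the same Ruby version end up with separate gem homes.
puts gem_home_for_ruby_root("/home/user/.rubies/ruby-2.6.1", "/home/user")
puts gem_home_for_ruby_root("/home/user/.rubies/ruby-2.6.1-enabled-shared", "/home/user")
```

Because the directory name is the key, any two installations that chruby lists under different names would never share compiled extensions or gem executables.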

@havenwood @postmodern Would you be OK with that? I'll make a PR soon with that idea.

postmodern commented 4 years ago

@eregon using the ruby install directory name for the gem home directory name is an interesting idea. This might also require compiling rubies with different installation directories (ex: ~/.rubies/ruby-2.6.3-shared vs ~/.rubies/ruby-2.6.3) into differently named src directories.

I would also like to point out that RubyGems now stores the C extension library files in sub-directories based on the host architecture, Ruby ABI version, and static vs --enable-shared.

$ ruby-install ruby
...
$ ruby-install --install-dir ~/.rubies/ruby-enable-shared ruby -- --enable-shared
...
$ # in a new terminal
$ chruby ruby-2.6.3
$ gem install nokogiri
...
$ chruby ruby-enable-shared
$ gem install nokogiri
...
$ tree .gem/ruby/2.6.3/extensions/
.gem/ruby/2.6.3/extensions/
└── x86_64-linux
    ├── 2.6.0
    │   └── nokogiri-1.10.3
    │       ├── gem.build_complete
    │       ├── gem_make.out
    │       ├── mkmf.log
    │       └── nokogiri
    │           └── nokogiri.so
    └── 2.6.0-static
        └── nokogiri-1.10.3
            ├── gem.build_complete
            ├── gem_make.out
            ├── mkmf.log
            └── nokogiri
                └── nokogiri.so

This now allows RubyGems to keep C extensions for different Ruby ABI versions or configurations separate. The only remaining issue is the absolute shebang (when --env-shebang is not used), which explicitly loads whichever ruby was last used to install the gem.
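
The layout in the tree output above can be approximated with the following sketch. It rebuilds the extensions subdirectory from RbConfig values; the helper name extension_subdir is hypothetical, and the rule that a "no" value for ENABLE_SHARED adds the -static suffix is an assumption inferred from this thread rather than a quote of RubyGems' implementation.

```ruby
require "rbconfig"
require "rubygems"

# Hypothetical helper: approximates where compiled extensions land under GEM_HOME.
def extension_subdir(config = RbConfig::CONFIG, platform = Gem::Platform.local.to_s)
  abi = config["ruby_version"]                        # closest thing to an ABI version
  abi += "-static" if config["ENABLE_SHARED"] == "no" # built without --enable-shared
  File.join("extensions", platform, abi)
end

puts extension_subdir
```

Passing an explicit hash makes the two cases easy to compare, for example extension_subdir({ "ruby_version" => "2.6.0", "ENABLE_SHARED" => "yes" }, "x86_64-linux").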

eregon commented 4 years ago

@postmodern Very interesting, I did not know RubyGems uses different directories per ABI version. That should help to solve the problem of TruffleRuby now that we set RbConfig::CONFIG['ruby_version'] properly.

FWIW, here is the current bug:

$ chruby truffleruby-19.0.0
$ gem install msgpack
$ ruby -rmsgpack -e0
OK
$ chruby truffleruby-19.1.0
$ ruby -rmsgpack -e0
~/.rubies/truffleruby-19.1.0/lib/mri/rubygems/core_ext/kernel_require.rb:54:in `require': Global variable rb_tr_true is declared but not defined. (RuntimeError)
    from ~/.rubies/truffleruby-19.1.0/lib/mri/rubygems/core_ext/kernel_require.rb:54:in `require'
    from ~/.gem/truffleruby/2.6.2/gems/msgpack-1.3.0/lib/msgpack.rb:11:in `<top (required)>'
    from ~/.rubies/truffleruby-19.1.0/lib/mri/rubygems/core_ext/kernel_require.rb:130:in `require'
    from ~/.rubies/truffleruby-19.1.0/lib/mri/rubygems/core_ext/kernel_require.rb:130:in `require'
    from -e:1:in `require'
    from -e:1:in `<main>'

Because both use the same compiled extension:

~/.gem/truffleruby/2.6.2/extensions/x86_64-linux/2.6.0/msgpack-1.3.0/msgpack/msgpack.su

Here is what happens when using latest master, which sets RbConfig::CONFIG['ruby_version'] properly:

$ chruby truffleruby-jvm

$ gem list | grep msgpack
Ignoring msgpack-1.3.0 because its extensions are not built. Try: gem pristine msgpack --version 1.3.0
msgpack (1.3.0)

$ ruby -rmsgpack -e0
Ignoring msgpack-1.3.0 because its extensions are not built. Try: gem pristine msgpack --version 1.3.0
~/code/truffleruby-ws/truffleruby/mxbuild/truffleruby-jvm/jre/languages/ruby/lib/mri/rubygems/core_ext/kernel_require.rb:54:in `require': cannot load such file -- msgpack (LoadError)
    from ~/code/truffleruby-ws/truffleruby/mxbuild/truffleruby-jvm/jre/languages/ruby/lib/mri/rubygems/core_ext/kernel_require.rb:54:in `require'
    from -e:1:in `require'
    from -e:1:in `<main>'

So it does list msgpack in gem list, yet it warns that the extension is not compiled. gem pristine msgpack --version 1.3.0 does fix it, but it's very cumbersome to do that for many gems. So that part is not very nice behavior. I think it's probably still a nicer user experience to separate gem homes for different Ruby installations or versions. And as you said, the absolute shebang is still an issue. But at least, the good news is the error is clearer than expected.

eregon commented 4 years ago

@postmodern @havenwood I've been playing with a radical idea: not setting any GEM_ variable. I've been using that for over a week now. This actually works great when rubies are installed under the $HOME (e.g., in ~/.rubies), as one can just install gems without sudo, and it just works: gems are installed inside the Ruby's prefix, so are guaranteed to be separate for each installed Ruby, and gem binaries get installed next to ruby in $RUBY_ROOT/bin. FWIW, rbenv also doesn't set GEM_* variables.

It's really awesome for developing TruffleRuby, because in that case I like to have MRI chosen with chruby, but I also want to be able to install gems on MRI and on TruffleRuby (via tool/jt.rb ruby in the truffleruby repo), without needing to switch the current ruby with chruby. This is super convenient to compare MRI and TruffleRuby behavior in the same shell. This doesn't work if GEM_HOME is set as gems would get mixed in the same home, and so it requires a lot of workarounds like removing GEM_HOME in the wrapper script to run TruffleRuby in development. That issue is partially documented in https://github.com/oracle/truffleruby/blob/master/doc/user/ruby-managers.md#using-truffleruby-without-a-ruby-manager

And of course, the diff is pretty nice and makes chruby reach another level of simplicity and minimalism, by letting RubyGems use its defaults: https://github.com/eregon/chruby/commit/a9555c9a7d3c69ab70b9ee4a112fbf2da084467f Notably, chruby $MYRUBY doesn't even need to start up the given ruby, which makes chruby faster, especially for Rubies with a slower startup.

Now, the open issues with that approach:

What do you think? Do you think this could be a good approach in chruby? Or is it too incompatible? What specifically would be too incompatible?

postmodern commented 4 years ago

@eregon just so I'm totally crystal clear, what exactly is causing that rb_tr_true error in your first example? If both TruffleRuby 19.0 and 19.1 have the same RUBY_VERSION (thus sharing the same msgpack extension file), shouldn't their CRuby ABIs be roughly the same? Are you absolutely certain this isn't an issue with TruffleRuby's CRuby ABI layer?


Not setting GEM_HOME and letting RubyGems install into the ruby's gem dir will clearly not work. Rubies not installed by the user (aka /opt/rubies/ and sudo ruby-install ...) are non-writable. Furthermore, there is value in keeping user-installed gems separate from the ruby, in case you need to delete/re-install one or the other.

We could derive GEM_HOME from RUBY_ROOT, so that each Ruby would get its own gem user-install directory. Although, it would prevent Rubies of the same RUBY_ENGINE and RUBY_VERSION from sharing gems; provided the ruby's ABI doesn't break and gem pristine --extensions was run and --env-shebang was enabled... This is kind of an exotic feature though, so we could axe it in favor of stricter gem dir separation. Personally, I would prefer using RbConfig::CONFIG['ruby_version'] in the GEM_HOME to enable gem sharing/recycling between versions or security upgrades; having to reinstall your gems because you upgraded from x.y.4 to x.y.5 is annoying.

I also feel the need to point out chruby explicitly forbids Ruby-specific workarounds. Any solution we come up with has to work with all (current/stable) Rubies out-of-the-box and all possible setups (such as non-user-writable rubies in /opt/rubies/).

eregon commented 4 years ago

@eregon just so I'm totally crystal clear, what exactly is causing that rb_tr_true error in your first example?

They have the same RUBY_VERSION, which is 2.6.2, aka the version of the Ruby language they are compatible with. RUBY_VERSION is not the ABI version though. RbConfig::CONFIG['ruby_version'] is the closest to an ABI version, and it's what is used by RubyGems and Bundler. For practical purposes, TruffleRuby's ABI changes for every release (e.g., macros are defined differently).
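
To compare these values on whichever Ruby is currently selected, a short sketch like the following can help. It only prints constants and an RbConfig entry that the running Ruby exposes (RUBY_ENGINE_VERSION requires Ruby 2.3 or later); the hash name versions is illustrative.

```ruby
require "rbconfig"

# Collect the version identifiers discussed in this thread for the running Ruby.
versions = {
  "RUBY_ENGINE" => RUBY_ENGINE,                 # implementation name
  "RUBY_VERSION" => RUBY_VERSION,               # compatible Ruby language version
  "RUBY_ENGINE_VERSION" => RUBY_ENGINE_VERSION, # implementation's own release version
  "RbConfig::CONFIG['ruby_version']" => RbConfig::CONFIG["ruby_version"], # closest to an ABI version
}

versions.each { |name, value| puts "#{name}: #{value}" }
```

Running this under two releases of the same implementation shows at a glance which of the values actually change between them.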

RbConfig::CONFIG['ruby_version'] was 2.6.0 in both 19.0 and 19.1 though, which indeed is a bug (now fixed): https://github.com/oracle/truffleruby/issues/1715 With that fixed, I get Ignoring msgpack-1.3.0 because its extensions are not built, showing chruby should have different gem homes for different ABI versions or for each installed Ruby.


Not setting GEM_HOME and letting RubyGems install into the ruby's gem dir will clearly not work. Rubies not installed by the user (aka /opt/rubies/ and sudo ruby-install ...) are non-writable.

Right, that's issue 1 above, and I think the most problematic. It works fine for me, because I never install a ruby version globally, but there are other use cases that ruby-install and chruby support so I understand we cannot do it. It's OK, I'll just use my branch for now.

Passing --user to gem install is a workaround supported directly in RubyGems though, but I agree it's not as practical. What if RubyGems automatically installed in a user directory when it detects it cannot write to the Ruby's prefix? It could be an idea, but obviously that wouldn't help for older Rubies and RubyGems. If we fixed this in RubyGems, I'd be tempted to simply install to the user dir by default, but changing defaults is likely not going to be easy.
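
The fallback idea floated here could look roughly like the sketch below. This is a hypothetical policy, not existing RubyGems behavior; the helper name pick_install_dir is made up for illustration, and only Gem.default_dir, Gem.user_dir and File.writable? are real calls.

```ruby
require "rubygems"

# Hypothetical policy: install into the Ruby's own gem dir when it is writable,
# otherwise fall back to the per-user gem dir (what --user selects today).
def pick_install_dir(default_dir = Gem.default_dir, user_dir = Gem.user_dir)
  File.writable?(default_dir) ? default_dir : user_dir
end

puts pick_install_dir
```

For a Ruby under a root-owned prefix such as /opt/rubies, this check would be expected to select the user directory when run as a regular user.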

In summary, it's sad that RubyGems defaults are not convenient and even tend to force using sudo for gem install which seems very risky and is inconvenient.

Furthermore, there is value in keeping user-installed gems separate from the ruby, in case you need to delete/re-install one or the other.

Which is issue 3 above, indeed.

Although, it would prevent Rubies of the same RUBY_ENGINE and RUBY_VERSION from sharing gems; provided the ruby's ABI doesn't break and gem pristine --extensions was run and --env-shebang was enabled... This is kind of an exotic feature though, so we could axe it in favor of stricter gem dir separation.

That sounds very exotic, already just by having 2 MRIs of the same RUBY_VERSION (and for non-MRI RUBY_VERSION does not relate to ABI). It would also not work well if one is --enable-shared and the other not. I believe anyone who had to use gem pristine found it inconvenient and confusing.

This stricter gem separation is what both RVM and rbenv do (RVM by setting GEM_HOME based on the Ruby name, rbenv by not setting GEM_HOME and not supporting system-wide Rubies). IMHO, it's the safe and easy to understand way.

Personally, I would prefer using RbConfig::CONFIG['ruby_version'] in the GEM_HOME to enable gem sharing/recycling between versions or security upgrades; having to reinstall your gems because you upgraded from x.y.4 to x.y.5 is annoying.

That would be fine too, except that one would also need to care about --enable-shared. And I would think other ./configure flags like --enable-shared can influence C-extension compilation (e.g., some debug flag that adds code in a macro), and make them ABI-incompatible.

And of course, env-shebang still blocks this (and forever will by the rule below). Unless maybe we'd write install: --env-shebang to ~/.gemrc?

I also feel the need to point out chruby explicitly forbids Ruby-specific workarounds. Any solution we come up with has to work with all (current/stable) Rubies out-of-the-box and all possible setups (such as non-user-writable rubies in /opt/rubies/).

Right. The line is a bit blurry to me though because, e.g., Object.const_defined?(:RUBY_ENGINE) is basically MRuby-specific and not so different from if Gem executable dir != prefix/bin.


I am sorry for this very long thread. I think I got a pretty full understanding of the situation now. The current gem dir based on RUBY_VERSION is suboptimal, because it leads to gem pristine warnings and non-usable executables for non-MRI Ruby implementations which use the same RUBY_VERSION for multiple releases. The only safe solution forward I see, to still support existing Rubies and system-wide Rubies, is to have one gem home per installed Ruby, i.e., base the gem home on the ruby's name (the name from chruby's list). Incidentally, it would also solve the gem home issue I had 6 years ago with chruby #186.

postmodern commented 4 years ago

RbConfig::CONFIG['ruby_version'] was 2.6.0 in both 19.0 and 19.1 though, which indeed is a bug (now fixed): oracle/truffleruby#1715

If the problem is fixed upstream, I always advocate for releasing a new version, encouraging users to upgrade, and moving forward.

With that fixed, I get Ignoring msgpack-1.3.0 because its extensions are not built, showing chruby should have different gem homes for different ABI versions or for each installed Ruby.

This definitely can seem confusing since the user may not know the gem dir is reused, and may require gem pristine --extensions --executables. We should shift the discussion to what's more predictable and less annoying to the users.

What if RubyGems automatically installed in a user directory when it detects it cannot write to the Ruby's prefix?

I have advocated for this in the past, but to no avail. Imo, it makes sense that if someone runs gem install as a regular user, it should install into the user's gem home. If they run sudo gem install it should install into the "system gem home"; note that sudo runs in a separate shell environment, thus bypassing the user's PATH, GEM_HOME, etc.

If we fixed this in RubyGems, I'd be tempted to simply install to the user dir by default, but changing defaults is likely not going to be easy.

While I am pro-upgrading and moving forward, unfortunately it would require a lot of work (PR and proving it would not break backwards compatibility) for RubyGems to consider accepting the feature.

It would also not work well if one is --enable-shared and the other not. I believe anyone who had to use gem pristine found it inconvenient and confusing.

I already pointed out that RubyGems uses different directories for shared vs static built extensions. Extensions built against Rubies without --enable-shared are installed into the GEM_HOME/extensions/ARCH-OS/RUBY_ABI_VERSION-static directory, whereas extensions built against --enable-shared Rubies lack the -static suffix on their extensions directory. While gem pristine is well documented, it isn't frequently used, unless you have to rebuild or fix your gems.

That would be fine too, except that one would also need to care about --enable-shared.

Again, this doesn't seem to be an issue. Unless you are using a much older version of RubyGems prior to the extensions/ directory compartmentalization?

Right. The line is a bit blurry to me though because, e.g., Object.const_defined?(:RUBY_ENGINE) is basically MRuby-specific and not so different from if Gem executable dir != prefix/bin.

Technically true, although all Rubies supported Object.defined? and it didn't change the existing behavior or introduce new functionality. The original logic (262a9f1fbcae755dc2e3b1269fca9f0b036531fb) used defined?(:RUBY_ENGINE) because RUBY_ENGINE was added in CRuby 1.9.1, and I wanted to support 1.8.7 at the time. I fail to see how pointing out this minor detail adds to the discussion, which should be focused on gem installation directories.


The only safe solution forward I see, to still support existing Rubies and system-wide Rubies, is to have one gem home per installed Ruby, i.e., base the gem home on the ruby's name (the name from chruby's list).

It is looking like this is the consensus. However, there are two downsides that we must recognize. If each Ruby gets its own separate gem home...

  1. users will have to re-install their gems after upgrading to a new patch-release version (i.e. security update). This would require knowing the list of gems you wish to re-install or manually moving the gem dir and running gem pristine.
  2. users will likely complain to chruby about used up disk space due to orphaned gem dirs. This would require remembering to rm -rf old gem dirs after manually uninstalling rubies.

I understand your desire to fix this for TruffleRuby users, but we must consider the negative impact on other users. How do we mitigate these negative impacts? Possibly a separate script for managing gem installation directories? Thoughts?
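
As one possible shape for such a helper script, the sketch below reports gem directories that no longer have a matching installed Ruby. It assumes the one-gem-home-per-Ruby layout discussed above (~/.gem/NAME matching a Ruby directory called NAME) and that Rubies live in ~/.rubies or /opt/rubies; the function name orphaned_gem_dirs is hypothetical, and the script only prints candidates and deletes nothing.

```ruby
# Hypothetical helper: names under gems_root with no same-named Ruby in any rubies dir.
def orphaned_gem_dirs(gems_root, rubies_dirs)
  # Names of all installed Rubies across the given search directories.
  installed = rubies_dirs.flat_map do |dir|
    Dir.exist?(dir) ? Dir.children(dir) : []
  end
  return [] unless Dir.exist?(gems_root)

  Dir.children(gems_root)
     .select { |name| File.directory?(File.join(gems_root, name)) }
     .reject { |name| installed.include?(name) }
     .sort
end

home = ENV["HOME"] || "/home/user" # placeholder fallback, for illustration only
orphans = orphaned_gem_dirs(File.join(home, ".gem"),
                            [File.join(home, ".rubies"), "/opt/rubies"])
orphans.each { |name| puts "orphaned gem dir: #{name}" }
```

Under the current ~/.gem/ENGINE/VERSION layout this would report unrelated directories, so it is only meaningful once gem homes are named after the Ruby installations themselves.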

eregon commented 4 years ago

If the problem is fixed upstream, I always advocate for releasing a new version, encouraging users to upgrade, and moving forward.

Of course, we should have 19.2.0 soon with that fix.

I have advocated for this in the past, but to no avail.

Did you discuss this on the RubyGems tracker or so? That idea sounds great to me, I'd be curious why some people disagree.

It would also not work well if one is --enable-shared and the other not. I believe anyone who had to use gem pristine found it inconvenient and confusing.

I already pointed out that RubyGems uses different directories for shared vs static built extensions.

Yes I know, I meant even though RubyGems doesn't mix them up, it causes gem pristine warnings, which is a less-than-ideal experience. And the complete deal killer for sharing a gem home between 2 Rubies is gem executables having an absolute path by default (no --env-shebang), so only working for one of the 2 Rubies.

It is looking like this is the consensus. However, there are two downsides that we must recognize. If each Ruby gets its own separate gem home...

  1. users will have to re-install their gems after upgrading to a new patch-release version (i.e. security update). This would require knowing the list of gems you wish to re-install or manually moving the gem dir and running gem pristine.

Doesn't RubyGems have a command to export a list of installed gems, that can be easily imported with RubyGems again?
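
Without asserting that a dedicated export command exists, the RubyGems Ruby API can at least enumerate what is installed, which gives a list that could be fed back to gem install on another Ruby. A minimal sketch, with the variable name installed chosen for illustration:

```ruby
require "rubygems"

# Collect "name version" lines for every gem visible in the current gem paths.
installed = Gem::Specification.map { |spec| "#{spec.name} #{spec.version}" }.sort.uniq

puts installed
```

The output depends entirely on the current GEM_HOME and GEM_PATH, so it should be run with the old Ruby selected before switching to the new one.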

  2. users will likely complain to chruby about used up disk space due to orphaned gem dirs. This would require remembering to rm -rf old gem dirs after manually uninstalling rubies.

By patch-release version, I assume you mean Z in X.Y.Z (not the old pNNN which is no longer used), like 2.6.1 -> 2.6.2.

This is already the case, every installation of MRI already uses its own gem directory, so at least it's not any worse than before for MRI, and it's correct for other implementations. In which case is it any worse than currently?

eregon commented 4 years ago

Closing this PR in favor of https://github.com/postmodern/chruby/pull/419