postmodern / ruby-install

Installs Ruby, JRuby, TruffleRuby, or mruby
MIT License
1.91k stars 255 forks source link

Remove the installation directory if it already exists and ends with the ruby name #379

Closed eregon closed 4 years ago

eregon commented 4 years ago

(from https://github.com/postmodern/ruby-install/pull/377#issuecomment-701342011)

eregon commented 4 years ago

@postmodern This is a better version of what I had in https://github.com/postmodern/ruby-install/pull/377#issuecomment-701342011, can you review? I think it would be really useful and avoid very complex issues from leftover files when reinstalling a Ruby. One simple example I can think of is extra files in site_ruby due to gem update --system that would affect even after reinstalling, which is unexpected. It's also very useful when testing ruby-install and trying to install the same Ruby repeatedly, which is how I noticed this issue.

postmodern commented 4 years ago

@eregon could you give me a reproducible example of the installation directory containing old files that do not get overwritten or removed during a re-installation?

eregon commented 4 years ago

For instance you can do this:

ruby-install truffleruby+graalvm
touch ~/.rubies/truffleruby+graalvm-20.2.0/foo
ruby-install truffleruby+graalvm
ls -l ~/.rubies/truffleruby+graalvm-20.2.0/foo # => still exists

This is to illustrate and keep it simple, but in practice you could have leftover gem files (including some for which compilation failed), leftover files from recompiling the openssl C extension when installing, leftover files under site_ruby from gem update --system, etc. I believe none of them are expected, the reinstallation should be as clean as possible and avoid keeping state from the previous installation.

Also extract() seems to assume $dest/archive_name does not exist, because if it does you'll end up with $dest/archive_name/archive_name: https://github.com/postmodern/ruby-install/blob/c071fdbd15790e697ed6a87e84fcea35f018aa98/share/ruby-install/util.sh#L115-L117 That's probably something we should check in extract() for robustness and error if the destination exists. I can add that check this PR if that's OK?

eregon commented 4 years ago

Actually it's simpler than that, just this fails:

$ bin/ruby-install truffleruby+graalvm
OK
$ bin/ruby-install truffleruby+graalvm
>>> Installing truffleruby+graalvm 20.2.0 into /home/eregon/.rubies/truffleruby+graalvm-20.2.0 ...
>>> Installing dependencies for truffleruby+graalvm 20.2.0 ...
Last metadata expiration check: 0:20:51 ago on Sun 04 Oct 2020 04:01:35 PM CEST.
Package zlib-devel-1.2.11-20.fc31.x86_64 is already installed.
Package openssl-devel-1:1.1.1g-1.fc31.x86_64 is already installed.
Package make-1:4.2.1-15.fc31.x86_64 is already installed.
Package gcc-9.3.1-2.fc31.x86_64 is already installed.
Package libxml2-2.9.10-3.fc31.x86_64 is already installed.
Package libxml2-2.9.10-3.fc31.i686 is already installed.
Dependencies resolved.
Nothing to do.
Complete!
>>> Downloading https://github.com/graalvm/graalvm-ce-builds/releases/download/vm-20.2.0/graalvm-ce-java8-linux-amd64-20.2.0.tar.gz into /home/eregon/src ...
>>> Verifying graalvm-ce-java8-linux-amd64-20.2.0.tar.gz ...
>>> Extracting graalvm-ce-java8-linux-amd64-20.2.0.tar.gz to /home/eregon/src/graalvm-ce-java8-20.2.0 ...
>>> Installing GraalVM 20.2.0 ...
mkdir: cannot create directory ‘/home/eregon/.rubies/truffleruby+graalvm-20.2.0’: File exists
!!! Installation of truffleruby+graalvm 20.2.0 failed!
zsh: exit 255   bin/ruby-install truffleruby+graalvm

And if you remove the mkdir "$install_dir" || return $? line in share/ruby-install/truffleruby+graalvm/functions.sh, then one gets:

bin/ruby-install truffleruby+graalvm                       
>>> Installing truffleruby+graalvm 20.2.0 into /home/eregon/.rubies/truffleruby+graalvm-20.2.0 ...
>>> Installing dependencies for truffleruby+graalvm 20.2.0 ...
Last metadata expiration check: 0:22:25 ago on Sun 04 Oct 2020 04:01:35 PM CEST.
Package zlib-devel-1.2.11-20.fc31.x86_64 is already installed.
Package openssl-devel-1:1.1.1g-1.fc31.x86_64 is already installed.
Package make-1:4.2.1-15.fc31.x86_64 is already installed.
Package gcc-9.3.1-2.fc31.x86_64 is already installed.
Package libxml2-2.9.10-3.fc31.x86_64 is already installed.
Package libxml2-2.9.10-3.fc31.i686 is already installed.
Dependencies resolved.
Nothing to do.
Complete!
>>> Downloading https://github.com/graalvm/graalvm-ce-builds/releases/download/vm-20.2.0/graalvm-ce-java8-linux-amd64-20.2.0.tar.gz into /home/eregon/src ...
>>> Verifying graalvm-ce-java8-linux-amd64-20.2.0.tar.gz ...
>>> Extracting graalvm-ce-java8-linux-amd64-20.2.0.tar.gz to /home/eregon/src/graalvm-ce-java8-20.2.0 ...
>>> Installing GraalVM 20.2.0 ...
>>> Installing the Ruby component ...
Downloading: Component catalog from www.graalvm.org
Processing Component: TruffleRuby
Component TruffleRuby (org.graalvm.ruby) is already installed.
ln: failed to create symbolic link '/home/eregon/.rubies/truffleruby+graalvm-20.2.0/graalvm/jre/languages/ruby/bin/gu': File exists
!!! Post-install tasks failed!
zsh: exit 255   bin/ruby-install truffleruby+graalvm

That notably creates truffleruby+graalvm-20.2.0/graalvm/graalvm-ce-java8-20.2.0 instead of truffleruby+graalvm-20.2.0/graalvm because truffleruby+graalvm-20.2.0/graalvm already exists.

That's the kind of problems I think we never want to deal with. Failing early like that mkdir is good. But reinstalling should also just work if we can safely remove the previous directory.

eregon commented 4 years ago

If you want an example with another Ruby, I'd think this would keep leftover files:

ruby-install ruby 2.6.6
chruby 2.6.6
gem --version # original
gem update --system
gem --version # latest

chruby system
ruby-install ruby 2.6.6
chruby 2.6.6
gem --version # latest instead of original (imagine e.g. latest RubyGems is broken on 2.6.6, this becomes a problem)
postmodern commented 4 years ago

First Example: I think it's out of scope for ruby-install to undo random changes users might make to installed rubies. ruby-install installs the ruby, if the user adds files then that's their choice. It is not ruby-install's purpose to protect yourself from yourself.

Second Example: this appears to be an issue with extracting into a pre-existing directory. This is a valid but separate issue.

Third Example: this is actually due to chruby setting GEM_HOME to ~/.gem/... and that rubygems gem update --system will install into $GEM_HOME, instead of the rubygems root directory within the Ruby installation directory. More of a chruby issue than a ruby-install issue.

eregon commented 4 years ago
  1. Agreed on random changes. But removing the Ruby prefix directory, if we know it's just for that Ruby seems very reasonable to me.

  2. How do we fix that one? extract_ruby()/extract() should remove the destination if it exists? Isn't it cleaner the way this PR does it?

  3. I tried locally with 2.6.3. gem update --system ignores GEM_HOME, etc. It changes ~/.rubies/ruby-2.6.3/lib/ruby/site_ruby/2.6.0.

    
    $ ruby-install ruby 2.6.3
    In a new terminal
    $ chruby 2.6.3
    $ ruby -e 'puts $:'
    /home/eregon/.rubies/ruby-2.6.3/lib/ruby/gems/2.6.0/gems/did_you_mean-1.3.0/lib
    /home/eregon/.rubies/ruby-2.6.3/lib/ruby/site_ruby/2.6.0
    /home/eregon/.rubies/ruby-2.6.3/lib/ruby/site_ruby/2.6.0/x86_64-linux
    /home/eregon/.rubies/ruby-2.6.3/lib/ruby/site_ruby
    /home/eregon/.rubies/ruby-2.6.3/lib/ruby/vendor_ruby/2.6.0
    /home/eregon/.rubies/ruby-2.6.3/lib/ruby/vendor_ruby/2.6.0/x86_64-linux
    /home/eregon/.rubies/ruby-2.6.3/lib/ruby/vendor_ruby
    /home/eregon/.rubies/ruby-2.6.3/lib/ruby/2.6.0
    /home/eregon/.rubies/ruby-2.6.3/lib/ruby/2.6.0/x86_64-linux
    $ echo $GEM_HOME 
    /home/eregon/.gem/ruby/2.6.3
    $ gem --version
    3.0.3
    $ gem update --system
    $ gem --version      
    3.1.4
    $ ruby -e 'puts $".grep(/rubygems/)' | head -1
    /home/eregon/.rubies/ruby-2.6.3/lib/ruby/site_ruby/2.6.0/rubygems/compatibility.rb

$ ruby-install ruby 2.6.3 In a new terminal $ chruby 2.6.3 $ gem --version 3.1.4 # => leaked state from previous installation


Did I convince you? :smiley: 

I think the only alternative is to fail on reinstall if the installation directory already exists and asks the user to `rm -rf` it, but it doesn't seem better at all to me.
postmodern commented 4 years ago

I am still hesitant about choosing to rm -rf (rm -r would technically be safer) the installation directory for the user. I think this might be more of a use-case for an uninstall step that safely removes all installed files and any additional "cache" directories. We could even use CRuby's make uninstall, if the src directory already exists prior to installation.

eregon commented 4 years ago

The reason I used rm -rf in this PR is I have type rm => rm is an alias for rm -i in my shell, so without -f it would ask me whether to remove each file one by one. We just established a line above that the directory exists, so rm -rf doesn't seem any less safe than rm -r.

Other Rubies don't have make uninstall, and the source directory might be gone, so that seems less reliable to me.

postmodern commented 4 years ago

This need seems too specific to your environment. It is ruby-install's philosophy to simply perform the manual steps necessary to install a ruby and nothing more. If the user wishes to remove the ruby directory entirely before re-installing, they can choose to run rm -rf the directory themselves; they may or may not always want to rm -rf the directory, so I don't want to overrule the user's preference.

I am more interested in support an uninstall step that cleanly removes the files.

The other issues in the truffleruby+graalvm functions.sh have been fixed in 7ae19e73806887ee7a69c4ad3a2beabf3efa8500.

eregon commented 4 years ago

What this means is ruby-install, if the install dir already exists, merges random files from before with the new installation, resulting in state that might be highly confusing/incompatible/broken and clearly nobody would want to debug that. It's not specific to my environment at all, as I have demonstrated with the gem update --system example, which is a frequent command Rubyists use. Imagine the latest RubyGems release is broken on some Ruby release for some reason (e.g., there are so many Ruby releases it's not reasonable to test all of them), then a ruby-install user which wants a clean install and does ruby-install ruby X.Y.Z will actually get the same broken state, or a more complex broken mix of files. They will have to know to rm -rf ~/.rubies/ruby-X.Y.Z manually, or ruby-install won't even warn them they are installing in a preexisting file hierarchy which might contain random state from the previous installation.

For me, this is about reliability. I think ruby-install is unreliable without this PR. What might avoid this issue in practice is it's probably rare that people need to reinstall Ruby. But a few external issues could trigger that a lot more.

Maybe it's useful to get a third opinion on this. @havenwood WDYT?

eregon commented 4 years ago

@postmodern BTW, to show how brittle it is to install into an existing directory, here is a bug, after your fixes, on current 0.8.0 (24a49a6be42d115bab872950e5661d091e5f352c):

$ bin/ruby-install truffleruby+graalvm
OK
$ bin/ruby-install truffleruby+graalvm
~/.rubies/truffleruby+graalvm-20.2.0/graalvm/graalvm-ce-java8-20.2.0 exists, and is a second copy of GraalVM, ~doubling the size of the installation
postmodern commented 4 years ago

@eregon found the issue (see fd724f66c339981f40c44d731dfd0aeaa85dfe49). It was caused by cp -R copying the whole graalvm-ce-java8-20.2.0 directory into $install_dir/graalvm when $install_dir/graalvm already existed. Again, I would rather have ruby-install act correctly, then add a klugish rm -rf workaround.

I also noticed another issue with graalvm. The tar.gz release archives contain files with -r--r--r--. This prevents any re-installation where the file would be copied over the existing one. I will submit an issue back to graalvm about that.

$ tar -tzvf ~/src/graalvm-ce-java8-linux-amd64-20.2.0.tar.gz | grep -- '-r--r--r--'
-r--r--r-- root/root      1522 2020-07-16 11:56 graalvm-ce-java8-20.2.0/ASSEMBLY_EXCEPTION
-r--r--r-- root/root    152996 2020-07-16 11:56 graalvm-ce-java8-20.2.0/jre/THIRD_PARTY_README
-r--r--r-- root/root     19274 2020-07-16 11:56 graalvm-ce-java8-20.2.0/jre/LICENSE
-r--r--r-- root/root      1522 2020-07-16 11:56 graalvm-ce-java8-20.2.0/jre/ASSEMBLY_EXCEPTION
-r--r--r-- root/root      2856 2020-07-16 11:56 graalvm-ce-java8-20.2.0/jre/lib/management/jmxremote.password.template
-r--r--r-- root/root      3376 2020-07-16 11:56 graalvm-ce-java8-20.2.0/jre/lib/management/snmp.acl.template
-r--r--r-- root/root      6876 2020-07-16 11:56 graalvm-ce-java8-20.2.0/jre/lib/cmm/sRGB.pf
-r--r--r-- root/root       488 2020-07-16 11:56 graalvm-ce-java8-20.2.0/jre/lib/cmm/LINEAR_RGB.pf
-r--r--r-- root/root       784 2020-07-16 11:56 graalvm-ce-java8-20.2.0/jre/lib/cmm/CIEXYZ.pf
-r--r--r-- root/root    234080 2020-07-16 11:56 graalvm-ce-java8-20.2.0/jre/lib/cmm/PYCC.pf
-r--r--r-- root/root       556 2020-07-16 11:56 graalvm-ce-java8-20.2.0/jre/lib/cmm/GRAY.pf
-r--r--r-- root/root      4102 2020-07-16 11:56 graalvm-ce-java8-20.2.0/jre/lib/currency.data
-r--r--r-- root/root    152996 2020-07-16 11:56 graalvm-ce-java8-20.2.0/THIRD_PARTY_README_JDK
postmodern commented 4 years ago

@eregon so I tried to reproduce your 2.6.3 / chruby / gem update --system example, but without ruby-install or chruby, to determine what the default behavior of CRuby was:

#
# in build terminal
#
$ wget https://cache.ruby-lang.org/pub/ruby/2.6/ruby-2.6.3.tar.bz2
$ tar -xjvf ruby-2.6.3.tar.bz2
$ cd ruby-2.6.3
$ ./configure --prefix="$HOME/.rubies/test-ruby-reinstall-2.6.3"
$ make
$ make install

#
# in test terminal
#
$ ~/.rubies/test-ruby-reinstall-2.6.3/bin/gem -v
3.0.3
$ ~/.rubies/test-ruby-reinstall-2.6.3/bin/gem update --system
...
$ ~/.rubies/test-ruby-reinstall-2.6.3/bin/gem -v
3.1.4

#
# back in build terminal
#
$ make install

$ back in test terminal
$ ~/.rubies/test-ruby-reinstall-2.6.3/bin/gem -v
3.1.4
$ ~/.rubies/test-ruby-reinstall-2.6.3/bin/gem list

Now let's test ruby gems with C extensions:

#
# in test terminal
#
$ ~/.rubies/test-ruby-reinstall-2.6.3/bin/gem list zlib

*** LOCAL GEMS ***

zlib (default: 1.0.0)
$ ~/.rubies/test-ruby-reinstall-2.6.3/bin/ruby -r zlib -e 'p Zlib.inflate(Zlib.deflate("testing"))'
"testing"
$ ~/.rubies/test-ruby-reinstall-2.6.3/bin/gem update zlib
Updating installed gems
Updating zlib
Fetching zlib-1.1.0.gem
Building native extensions. This could take a while...
Successfully installed zlib-1.1.0
Parsing documentation for zlib-1.1.0
Installing ri documentation for zlib-1.1.0
Installing darkfish documentation for zlib-1.1.0
Done installing documentation for zlib after 0 seconds
Parsing documentation for zlib-1.1.0
Done installing documentation for zlib after 0 seconds
Gems updated: zlib

#
# back in build terminal
#
$ make install

#
# back in test terminal
#
$ ~/.rubies/test-ruby-reinstall-2.6.3/bin/gem list zlib

*** LOCAL GEMS ***

zlib (1.1.0, default: 1.0.0)
$ ~/.rubies/test-ruby-reinstall-2.6.3/bin/ruby -r zlib -e 'p Zlib.inflate(Zlib.deflate("testing"))'
"testing"

As you can see after re-installing the ruby, even after updating built-in gems, everything works perfectly! Not clearing the rubygems directory during re-installation is CRuby's expected behavior and nothing appears to break.

Now, is the fact that the gems which we updated or installed are still present after re-installing CRuby a bit surprising? Maybe. Is CRuby stuck in some sort of "incompatible" or "broken" state? No, it is perfectly usable.

Here is an example of me installing rails and generating a bare bones API app:

$ ~/.rubies/test-ruby-reinstall-2.6.3/bin/gem install rails
...
$ ~/.rubies/test-ruby-reinstall-2.6.3/bin/rails new --api hello_world
...
$ cd hello_world
$ ~/.rubies/test-ruby-reinstall-2.6.3/bin/rails server
=> Booting Puma
=> Rails 6.0.3.4 application starting in development 
=> Run `rails server --help` for more startup options
Puma starting in single mode...
* Version 4.3.6 (ruby 2.6.3-p62), codename: Mysterious Traveller
* Min threads: 5, max threads: 5
* Environment: development
* Listening on tcp://127.0.0.1:3000
* Listening on tcp://[::1]:3000
Use Ctrl-C to stop
Started GET "/" for ::1 at 2020-10-10 00:37:12 -0700
   (4.5ms)  SELECT sqlite_version(*)
Processing by Rails::WelcomeController#index as */*
  Rendering /home/postmodern/.rubies/test-ruby-reinstall-2.6.3/lib/ruby/gems/2.6.0/gems/railties-6.0.3.4/lib/rails/templates/rails/welcome/index.html.erb
  Rendered /home/postmodern/.rubies/test-ruby-reinstall-2.6.3/lib/ruby/gems/2.6.0/gems/railties-6.0.3.4/lib/rails/templates/rails/welcome/index.html.erb (Duration: 7.5ms | Allocations: 485)
Completed 200 OK in 13ms (Views: 10.3ms | ActiveRecord: 0.0ms | Allocations: 2703)

^C

#
# back in build terminal
#
$ make install

#
# back in test terminal
#
$ ~/.rubies/test-ruby-reinstall-2.6.3/bin/rails server
=> Booting Puma
=> Rails 6.0.3.4 application starting in development 
=> Run `rails server --help` for more startup options
Puma starting in single mode...
* Version 4.3.6 (ruby 2.6.3-p62), codename: Mysterious Traveller
* Min threads: 5, max threads: 5
* Environment: development
* Listening on tcp://127.0.0.1:3000
* Listening on tcp://[::1]:3000
Use Ctrl-C to stop
Started GET "/" for ::1 at 2020-10-10 00:38:58 -0700
   (0.7ms)  SELECT sqlite_version(*)
Processing by Rails::WelcomeController#index as */*
  Rendering /home/postmodern/.rubies/test-ruby-reinstall-2.6.3/lib/ruby/gems/2.6.0/gems/railties-6.0.3.4/lib/rails/templates/rails/welcome/index.html.erb
  Rendered /home/postmodern/.rubies/test-ruby-reinstall-2.6.3/lib/ruby/gems/2.6.0/gems/railties-6.0.3.4/lib/rails/templates/rails/welcome/index.html.erb (Duration: 8.3ms | Allocations: 360)
Completed 200 OK in 11ms (Views: 9.7ms | ActiveRecord: 0.0ms | Allocations: 1711)

^C

Apart from the sassc gem taking forever to compile, everything worked just fine, even after re-installing multiple times. Unless you can provide us with an reproducible example that clearly shows a Ruby or RubyGems getting stuck in some kind of inconsistent or broken state, I don't really see the need to bypass the expected re-installation behavior of upstream CRuby or RubyGems.

If you still feel strongly about clearing the RubyGems directory during re-installation, then I suggest you submit an enhancement, for clearing the RubyGems directory during re-installation, to ruby-core or rubygems.

If you're still worried that CRuby or RubyGems might somehow break due to re-installation, you can simply rm -rf ~/.rubies/ruby-... the directory yourself before running ruby-install.

eregon commented 4 years ago

@eregon found the issue (see fd724f6). It was caused by cp -R copying the whole graalvm-ce-java8-20.2.0 directory into $install_dir/graalvm when $install_dir/graalvm already existed. Again, I would rather have ruby-install act correctly, then add a klugish rm -rf workaround.

From my point of view of building TruffleRuby and trying to make it work reliably for people, the recent changes in 0.8.0 are far more klugish workarounds than installing in a clean state by ensuring the install directory is empty. Maybe we could agree rm -rf $install_dir is appropriate for truffleruby & truffleruby+graalvm?

I also noticed another issue with graalvm. The tar.gz release archives contain files with -r--r--r--. This prevents any re-installation where the file would be copied over the existing one. I will submit an issue back to graalvm about that.

I think it's perfectly reasonable for files that should not be modified to be packaged as read-only. For me it's a sign the "write the files over and hope there is no bad interactions" approach is the brittle workaround and not the clean solution. Those files look like JDK files BTW, so maybe it's the same issue for standard JDK?

eregon commented 4 years ago

@eregon so I tried to reproduce your 2.6.3 / chruby / gem update --system example, but without ruby-install or chruby, to determine what the default behavior of CRuby was:

Thanks for trying it.

As you can see after re-installing the ruby, even after updating built-in gems, everything works perfectly! Not clearing the rubygems directory during re-installation is CRuby's expected behavior and nothing appears to break.

Yes, if nothing is broken it works, but then there is also no reason to resinstall. As you probably know, various Ruby users or developers often get a messy installation (especially on macOS, where both the system Ruby and Homebrew's ruby are pretty inconvenient for gems), and they might have less experience than you or me. Installing in a clean state in that case is IMHO invaluable. They might not realize they need to rm -rf manually.

Now, is the fact that the gems which we updated or installed are still present after re-installing CRuby a bit surprising? Maybe. Is CRuby stuck in some sort of "incompatible" or "broken" state? No, it is perfectly usable.

If the latest RubyGems was broken (it happened at least once in the past), or that update of zlib or some other gem was broken, then ruby-install ruby 2.6.3 will keep it broken. If it's some gem used by RubyGems, then it will probably lead to fairly confusing errors. For me that loses the point completely of re-installing. To clarify, I'm not trying to spread FUD, I'm trying to have a reliable way to install and re-install Ruby.

I know ruby-install doesn't officially support *-head, but if it did the problem would be even clearer as the set of files there might change (e.g., some revision removes a gem, it would still be there in stdlib even though it shouldn't). I guess that's part of why you don't want to support *-head versions, but for me it's a clear indication the universal way to provide a clean reinstallation, if Ruby is installed in its own prefix, is rm -rf ruby_prefix before reinstalling.

If you're still worried that CRuby or RubyGems might somehow break due to re-installation, you can simply rm -rf ~/.rubies/ruby-... the directory yourself before running ruby-install.

Yes, but if I or someone else forget everything might be messed up. I think at the very least ruby-install should warn about this. We could also maybe prompt for rm -rf $install_dir. Also the logic for all functions.sh is becoming more complicated because we don't have the guarantee the installation directory is clean, and that makes it a lot harder to ensure it's correct. Shell commands like cp/mv/ln are IMHO only sane to reason about if we know the target exist or not, if it can be both it becomes very hard to get it right (and very easy to get wrong when modifying them).

In the end, what I'll do is use & recommend ruby-build instead which has the more reliable behavior of removing the installation directory. (BTW RVM also removes the prefix when reinstalling) This discussion has made it clear strange edge cases are more important than reliability in ruby-install, and I cannot agree with that. It's your software and your decision though. FWIW, I'll keep using chruby because I think it's the cleanest and most reliable Ruby switcher.