rubygems / bundler

Manage your Ruby application's gem dependencies
https://bundler.io
MIT License

Freezing while bundle install #3666

Closed deepj closed 9 years ago

deepj commented 9 years ago

Since around 1.9.8 (including 1.10.0.rc1), bundle install has started freezing for me. Interestingly, it does not matter which Ruby version or implementation I use. My connection does not seem to be the issue either: it behaves the same with a different provider.

I've run into this under Ruby 2.1.6, Ruby 2.2.2 and JRuby 9.0.0.0.pre2 on Mac OS X 10.10.3. The CPU is an Intel i7-4650U (2 cores/4 threads).

I believe it is caused by the parallel workers. If I change them from 4 to 0 in my global config, bundle install no longer freezes.

segiddins commented 9 years ago

What's the backtrace if you kill bundler while it's frozen?

deepj commented 9 years ago

There is no backtrace after a kill.
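
When a hung Ruby process dies without printing a backtrace, one generic technique is to install a signal handler that prints the backtrace of every live thread. The following is only a sketch of that idea, not Bundler code: the helper name dump_thread_backtraces and the choice of the USR1 signal are assumptions made for illustration.

```ruby
# Hypothetical helper (not part of Bundler): print a backtrace for
# every live thread in the current process.
def dump_thread_backtraces(io = $stderr)
  Thread.list.each do |thread|
    io.puts "Thread #{thread.object_id} (status: #{thread.status.inspect})"
    # backtrace is nil for a thread that has already finished
    (thread.backtrace || ["<no backtrace>"]).each { |line| io.puts "  #{line}" }
  end
end

# Assumption: USR1 is unused in this process (POSIX platforms only).
# Sending `kill -USR1 <pid>` would then show where each thread is blocked.
Signal.trap("USR1") { dump_thread_backtraces }
```

Such a handler would have to be loaded into the process before it hangs, so it helps only when the freeze can be reproduced.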

pducks32 commented 9 years ago

Does it freeze every time or randomly?

deepj commented 9 years ago

Under MRI randomly. Under JRuby very often.

deepj commented 9 years ago

Sample Gemfile

source 'https://rubygems.org'

ruby '2.1.6'

gem 'rails',       '3.2.21'
gem 'refinerycms', '2.1.5'
gem 'slim',        '3.0.3'

group :assets do
  gem 'sass-rails',        '3.2.6'
  gem 'compass-rails',     '2.0.4'
  gem 'compass-blueprint', '1.0.0'
  gem 'coffee-rails',      '3.2.2'
  gem 'uglifier',          '2.7.1'
end

group :development do
  gem 'sqlite3', '1.3.10'
  gem 'rubocop', '0.31.0', require: false
end

group :production do
  gem 'pg',             '0.18.2'
  gem 'fog',            '1.30.0'
  gem 'puma',           '2.11.2'
  gem 'dalli',          '2.7.4'
  gem 'memcachier',     '0.0.2'
  gem 'rollbar',        '1.5.2'
  gem 'newrelic_rpm',   '3.12.0.288'
  gem 'rails_12factor', '0.0.3'
end

pducks32 commented 9 years ago

What happens when you set a different number of jobs? bundle install -j 1/2/3/etc.
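
One way to run that comparison systematically is a small script that starts the command and kills it if it stalls past a time limit. This is a minimal sketch under stated assumptions: the helper name run_with_timeout, its return values and the 0.1 second polling interval are all hypothetical choices, not anything Bundler provides.

```ruby
# Hypothetical helper: run a command and kill it if it exceeds `limit`
# seconds. Returns :ok, :failed, or :timed_out.
def run_with_timeout(cmd, limit)
  pid = Process.spawn(*cmd, out: File::NULL, err: File::NULL)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + limit
  loop do
    # With WNOHANG, wait2 returns nil while the child is still running
    _, status = Process.wait2(pid, Process::WNOHANG)
    return status.success? ? :ok : :failed if status

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
      Process.kill("KILL", pid)
      Process.wait(pid) # reap the killed child
      return :timed_out
    end
    sleep 0.1
  end
end
```

Calling run_with_timeout(["bundle", "install", "--jobs", "2"], 300) once per job count would then report which settings finish and which hang, without anyone having to watch the terminal.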

pducks32 commented 9 years ago

We really should be logging more in the installers. I'll work on adding some debugging output so that introspection is a bit easier.

deepj commented 9 years ago

I've tried different numbers of jobs and the problem still occurs.

With 1 job:

$ ruby -v
ruby 2.1.6p336 (2015-04-13 revision 50298) [x86_64-darwin14.0]
$ gem -v
2.4.7
$ gem install bundler
Fetching: bundler-1.9.9.gem (100%)
Successfully installed bundler-1.9.9
1 gem installed
$ time bundle install -j 1
Fetching gem metadata from https://rubygems.org/...........
Fetching version metadata from https://rubygems.org/...
Fetching dependency metadata from https://rubygems.org/..
Installing rake 10.4.2
Installing CFPropertyList 2.3.1
Installing i18n 0.7.0
Installing multi_json 1.11.0
Installing activesupport 3.2.21
^C
SystemExit: exit
An error occurred while installing builder (3.0.4), and Bundler cannot continue.
Make sure that `gem install builder -v '3.0.4'` succeeds before bundling.

real    1m18.520s
user    0m2.833s
sys     0m0.353s

Just in case, here is my ping to google.com. I don't think my internet connection is the problem.

$ ping www.google.com
PING www.google.com (173.194.122.16): 56 data bytes
64 bytes from 173.194.122.16: icmp_seq=0 ttl=55 time=12.737 ms
64 bytes from 173.194.122.16: icmp_seq=1 ttl=55 time=14.528 ms
64 bytes from 173.194.122.16: icmp_seq=2 ttl=55 time=14.179 ms
64 bytes from 173.194.122.16: icmp_seq=3 ttl=55 time=13.129 ms
64 bytes from 173.194.122.16: icmp_seq=4 ttl=55 time=13.947 ms
64 bytes from 173.194.122.16: icmp_seq=5 ttl=55 time=14.243 ms
64 bytes from 173.194.122.16: icmp_seq=6 ttl=55 time=14.527 ms
64 bytes from 173.194.122.16: icmp_seq=7 ttl=55 time=11.995 ms
64 bytes from 173.194.122.16: icmp_seq=8 ttl=55 time=12.779 ms
64 bytes from 173.194.122.16: icmp_seq=9 ttl=55 time=11.458 ms
64 bytes from 173.194.122.16: icmp_seq=10 ttl=55 time=13.596 ms
64 bytes from 173.194.122.16: icmp_seq=11 ttl=55 time=11.168 ms
64 bytes from 173.194.122.16: icmp_seq=12 ttl=55 time=14.034 ms
64 bytes from 173.194.122.16: icmp_seq=13 ttl=55 time=13.738 ms
64 bytes from 173.194.122.16: icmp_seq=14 ttl=55 time=13.672 ms
^C
--- www.google.com ping statistics ---
15 packets transmitted, 15 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 11.168/13.315/14.528/1.043 ms

pducks32 commented 9 years ago

And is it always builder?

pducks32 commented 9 years ago

The weird thing is that 0 jobs and 1 job act exactly the same; both install sequentially. Let me know if you can think of anything else that would help to reproduce the problem.

deepj commented 9 years ago

No, it is always a random gem.

indirect commented 9 years ago

Maybe this has something to do with the Monitor that we use to synchronize building gems one at a time? :/
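
For readers unfamiliar with the pattern mentioned here, the sketch below shows how a single Monitor can serialize one step across several worker threads. It is an illustration only and does not reproduce Bundler's installer: the method name install_all and the queue-based worker loop are hypothetical.

```ruby
require "monitor"
require "thread" # Queue; already loaded by default on current Ruby versions

# Illustration only: several workers share one Monitor, so the guarded
# section runs for one worker at a time.
def install_all(gem_names, worker_count)
  lock = Monitor.new
  queue = Queue.new
  gem_names.each { |name| queue << name }
  installed = []

  workers = Array.new(worker_count) do
    Thread.new do
      loop do
        name = begin
          queue.pop(true) # non-blocking pop raises ThreadError when empty
        rescue ThreadError
          break
        end
        # Only one thread may be inside this block at a time. If the
        # holder blocks here (for example on I/O), every other worker
        # that reaches this point waits as well.
        lock.synchronize { installed << name }
      end
    end
  end
  workers.each(&:join)
  installed
end
```

The comment inside the synchronized block is the relevant part for this issue: a worker that stalls while holding the lock would make the whole run look frozen with no CPU usage.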


wpp commented 9 years ago

I've also experienced this behaviour, although I'm using Bundler version 1.9.4. I documented it (with logs from DEBUG_RESOLVER=y bundle install --verbose) in this Stack Overflow question.

CPU usage is 0 and the network monitor also reports that nothing is happening.

wpp commented 9 years ago

Hmm...I've tried different versions of Bundler:

and rubygems.

With different combinations of the -j flag (0-4). They all "froze". By "freeze" I mean: 0% CPU usage, no network activity, no stack trace on ^C, and it happens with different gems.

This seemed so weird that I had to try on a different machine.

On a VPS with CentOS Linux release 7.0.1406 (Core), bundle install completes every single time. So the behaviour I'm seeing is most likely not Bundler-related. I have a feeling that in my case it might be VirtualBox, boot2docker or the Docker client.

richbowen commented 9 years ago

@wpp, on the CentOS VPS did you try multiple ruby versions as well?

wpp commented 9 years ago

@rgb-one I've only used ruby 2.2.2 for all of these tests. I'll switch ruby version as well, try again and let you know. (ruby 2.2.2p95 (2015-04-13 revision 50295) [x86_64-darwin14])

wpp commented 9 years ago

We figured out that our issue was definitely network-related. Running bundle install from a different network (and on the CentOS VPS I mentioned) worked. Trying to narrow the problem down:

while true; do
    curl -v https://aws-eu-cache01.rubygems.org
done

This basically ended up hanging at

* Rebuilt URL to: https://aws-eu-cache01.rubygems.org/
* Hostname was NOT found in DNS cache
*   Trying 54.216.164.178...
* Connected to aws-eu-cache01.rubygems.org (54.216.164.178) port 443 (#0)

every so often.

The TCP Dialog in these cases ended up looking like:

Wireshark TCP stream

The request that contains the server's certificate seems to be lost somewhere in the network, which results in the [FIN, ACK] downstream. But I honestly don't understand how that retransmission situation is triggered (why does it keep retransmitting the [FIN, ACK]?).
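
The curl loop above can be approximated in Ruby with explicit timeouts, so that a stalled connection or TLS handshake is reported instead of hanging indefinitely. This is a rough sketch, not a diagnosis tool used in this thread: the method name probe_https, its return values and the 5 second default are assumptions.

```ruby
require "net/http"
require "openssl"

# Hypothetical probe: attempt one HTTPS request with short timeouts and
# report whether connecting or reading stalled.
def probe_https(host, port: 443, timeout: 5)
  http = Net::HTTP.new(host, port)
  http.use_ssl = true
  # In current Ruby versions open_timeout covers the TCP connect and
  # the TLS handshake; read_timeout covers waiting for response data.
  http.open_timeout = timeout
  http.read_timeout = timeout
  http.start { |conn| conn.head("/") }
  :ok
rescue Net::OpenTimeout
  :stalled_connecting
rescue Net::ReadTimeout
  :stalled_reading
rescue SystemCallError, SocketError, OpenSSL::SSL::SSLError => e
  "error: #{e.class}"
end
```

Calling probe_https("aws-eu-cache01.rubygems.org") in a loop and counting the :stalled_connecting results would give a rough failure rate to compare between networks.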

richbowen commented 9 years ago

I tried to reproduce this using a virtualized environment but to no effect. bundle install -j 3/4 worked fine for me.

Note: I can only run 32-bit guests

deepj commented 9 years ago

I'm still facing this problem. I wouldn't say the problem is my internet connection, because I've tried several types of networks (LTE, ADSL, cable, academic network, company network). The problem still occurs :(

pducks32 commented 9 years ago

I ✈️ back tonight and will take a look at it. I have some ideas.

wpp commented 9 years ago

Just FYI: since Friday we can't reproduce the issue. It seems the bad router (if that was the problem) has been replaced or fixed.