Closed purbon closed 8 years ago
interesting bit, breaking this
task "gems", [:bundle] do |task, args|
require "bootstrap/environment"
Rake::Task["dependency:rbx-stdlib"] if LogStash::Environment.ruby_engine == "rbx"
Rake::Task["dependency:bundler"].invoke
puts("Invoking bundler install...")
output, exception = LogStash::Bundler.invoke!(:install => true)
puts(output)
raise(exception) if exception
end # task gems
task "all" => "gems"
into two tasks, one for
Rake::Task["dependency:rbx-stdlib"] if LogStash::Environment.ruby_engine == "rbx"
Rake::Task["dependency:bundler"].invoke
and one for
puts("Invoking bundler install...")
output, exception = LogStash::Bundler.invoke!(:install => true)
puts(output)
raise(exception) if exception
seems to fix the issue, but for now this looks to me more like a workaround than a real fix.
@ph @colin as this is bundler related, what do you think?
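For readers following along, the split described above could look roughly like the sketch below. The task names `gems:dependencies` and `gems:install` are hypothetical (the thread does not show the names actually used), and the task bodies are stand-ins that only record the call order, so the sketch runs outside the Logstash tree.

```ruby
require "rake"

# Gives this standalone script the `task` method that a Rakefile has implicitly.
extend Rake::DSL

# Hypothetical stand-in: records which step ran, in order.
CALLS = []

# First half of the original "gems" task. In Logstash this would invoke the
# dependency tasks (for example Rake::Task["dependency:bundler"].invoke).
task "gems:dependencies" do
  CALLS << :dependencies
end

# Second half. In Logstash this would call
# LogStash::Bundler.invoke!(:install => true) and raise on failure.
task "gems:install" do
  CALLS << :bundle_install
end

# "gems" keeps its name, so `task "all" => "gems"` continues to work.
task "gems" => ["gems:dependencies", "gems:install"]

Rake::Task["gems"].invoke
p CALLS
```

Keeping `gems` as an umbrella task means callers do not have to change, while each half can still be run on its own.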
Alpha1 works on Fedora 23 for me --
⓿ localhost(~/build/logstash-5.0.0-alpha1)
% bin/logstash-plugin list --verbose | grep jdbc
logstash-input-jdbc (3.0.2)
⓿ localhost(~/build/logstash-5.0.0-alpha1)
% bin/logstash-plugin uninstall logstash-input-jdbc
Uninstalling logstash-input-jdbc
% head /etc/*release
==> /etc/fedora-release <==
Fedora release 23 (Twenty Three)
A mostly-stock fresh-ish fedora 23 w/ logstash 5.0.0-alpha1 also works for me (previous comment was done on my workstation, did another test on a cleaner install)
Funky. I will keep trying to reproduce it. I trust this failure as being a real failure.
Also I had success with the elastic/fedora-23-x86_64 vagrant box and the following flow of commands, issued as the vagrant user right after vagrant up:
sudo dnf install -y java-1.8.0-openjdk-devel.x86_64
git clone git://github.com/sstephenson/rbenv.git .rbenv
echo 'export PATH="$HOME/.rbenv/bin:$PATH"' >> ~/.bash_profile
echo 'eval "$(rbenv init -)"' >> ~/.bash_profile
git clone git://github.com/sstephenson/ruby-build.git ~/.rbenv/plugins/ruby-build
source ~/.bash_profile
git clone https://github.com/elastic/logstash.git
cd logstash/
rbenv install jruby-1.7.25
rbenv local jruby-1.7.25
rake bootstrap
ci/ci_test.sh also worked.
interesting bit, see https://logstash-ci.elastic.co/job/elastic+logstash+master+multijob-os-compatibility/46/os=fedora/console
Building remotely on slave-09dfb3d1 (fedora-23 virtual swarm fedora linux) in workspace /var/lib/jenkins/workspace/elastic+logstash+master+multijob-os-compatibility/os/fedora
It looks like it is Fedora 23. @elasticdog, any idea what the differences between them might be?
It is known that reverting https://github.com/elastic/logstash/pull/5134/commits/5de3ce40d1c2f12395ecee4685ca98fa7b206032 (that is, going back to an earlier version of jruby-openssl and/or jruby) makes the issue go away, so most probably this is all about having wrong CA certificates somewhere or a buggy openssl version.
Last update on this:
[vagrant@localhost ~]$ sudo /opt/logstash/bin/logstash-plugin install logstash-input-jdbc
Validating logstash-input-jdbc
Installing logstash-input-jdbc
Installation successful
To me this is starting to look mostly like a bootstrap problem rather than a plugin manager problem, so we might be safe not to roll back. It would be nice to get confirmation from another pair of eyes here.
@ph willing to take it?
@purbon I can take a look at that, thanks for keeping track of all the debugging <3
To me this is starting to look mostly like a bootstrap problem rather than a plugin manager problem, so we might be safe not to roll back. It would be nice to get confirmation from another pair of eyes here.
This looks like the problem here. I remember that a few eons ago we had a problem doing the bootstrap and installing other gems in the same rake task. IIRC it might have been related to a bundler available somewhere in the system that was messing up the build process.
But in the described case that should not be the issue, since I don't believe rbenv does a gem install bundler after installing a specific ruby version.
I can reproduce the bug, and I can confirm that installing jruby with rbenv doesn't install a local bundler.
OKAY, I have made more of a breakthrough.
To get more useful information during the bundle install I have changed the lib/bootstrap/bundler.rb file.
diff --git a/lib/bootstrap/bundler.rb b/lib/bootstrap/bundler.rb
index 2948fe8..65e404b 100644
--- a/lib/bootstrap/bundler.rb
+++ b/lib/bootstrap/bundler.rb
@@ -106,36 +106,9 @@ module LogStash
try = 0
# capture_stdout also traps any raised exception and pass them back as the function return [output, exception]
- output, exception = capture_stdout do
- loop do
- begin
- ::Bundler.reset!
- ::Bundler::CLI.start(bundler_arguments(options))
- break
- rescue ::Bundler::VersionConflict => e
- $stderr.puts("Plugin version conflict, aborting")
- raise(e)
- rescue ::Bundler::GemNotFound => e
- $stderr.puts("Plugin not found, aborting")
- raise(e)
- rescue => e
- if try >= options[:max_tries]
- $stderr.puts("Too many retries, aborting, caused by #{e.class}")
- $stderr.puts(e.message) if ENV["DEBUG"]
- raise(e)
- end
-
- try += 1
- $stderr.puts("Error #{e.class}, retrying #{try}/#{options[:max_tries]}")
- $stderr.puts(e.message)
- sleep(0.5)
- end
- end
- end
+ ::Bundler.reset!
+ ::Bundler::CLI.start(bundler_arguments(options))
- raise exception if exception
-
- return output
end
# build Bundler::CLI.start arguments array from the given options hash
@@ -162,6 +135,8 @@ module LogStash
arguments << "--all" if options[:all]
end
+ arguments << "--verbose"
+
arguments.flatten
end
Now if I run ci/ci_test.s I get this useful trace.
HTTP GET https://bundler.rubygems.org/api/v1/dependencies?gems=faraday-middleware
HTTP 200 OK
Query List: []
Resolving dependencies.....
Installing addressable 2.3.8
0: addressable (2.3.8) from /home/vagrant/logstash/vendor/bundle/jruby/1.9/specifications/addressable-2.3.8.gemspec
Java::JavaLang::OutOfMemoryError: GC overhead limit exceeded
org.bouncycastle.asn1.ASN1Set.toDERObject(Unknown Source)
org.bouncycastle.asn1.DEROutputStream.writeObject(Unknown Source)
org.bouncycastle.asn1.DERSequence.encode(Unknown Source)
org.bouncycastle.asn1.ASN1OutputStream.writeObject(Unknown Source)
org.bouncycastle.jcajce.provider.asymmetric.x509.X509CertificateObject.getSubjectX500Principal(Unknown Source)
org.jruby.ext.openssl.x509store.X509AuxCertificate.getSubjectX500Principal(X509AuxCertificate.java:206)
org.jruby.ext.openssl.x509store.Certificate.matches(Certificate.java:56)
org.jruby.ext.openssl.x509store.Store.matchedObject(Store.java:315)
org.jruby.ext.openssl.x509store.Store.addCertificate(Store.java:288)
org.jruby.ext.openssl.x509store.Lookup.loadCertificateOrCRLFile(Lookup.java:331)
org.jruby.ext.openssl.x509store.Lookup$ByFile.call(Lookup.java:525)
org.jruby.ext.openssl.x509store.Lookup$ByFile.call(Lookup.java:508)
org.jruby.ext.openssl.x509store.Lookup.control(Lookup.java:154)
Good debugging here @ph, thanks for helping out.
Okay, not sure what is going on in the jruby-openssl land, but I do know that they have updated their dependency on bouncy castle and changed how they deal with the local store.
But if it's openssl, then doing SSL work with the same jruby version should also break, shouldn't it?
@purbon well, the error is raised from openssl, but it could be a leak from somewhere else.
The stack trace indicates that the code is trying to load multiple certificates and runs out of memory doing it.
I know that the error is raised from there, but I tried to narrow it down to a pure SSL issue myself and failed. I guess you will have more success 👍 happy hunting!
Another interesting question is why this happens if we execute the tasks the way we do, but not if we break them up as I noted in https://github.com/elastic/logstash/issues/5179#issuecomment-213517628
@purbon breaking it into multiple calls makes it friendlier on memory, so we don't get an OOM.
Makes sense, good work here man!
OK, one thing that just struck me: I don't believe we have this bug when using the logstash shell command. When we use a rake task it actually uses the default settings of JRuby and launches the JVM with a default heap size of 500 MB. As a comparison, we use 1 GB for logstash.
Okay, using export JRUBY_OPTS=-J-Xmx1024m makes the process complete without any problem. The update of jruby or jruby-openssl might have increased the memory consumption slightly. We may have been on the edge of going OOM for some time.
I recommend we use the same memory default for running the rake command as our shell script uses.
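As a rough illustration of that recommendation: the JVM heap limit is fixed when the JVM starts, so a Rakefile cannot raise it for the process it is already running in. It has to be set in the environment of whatever launches rake. The helper below is a hypothetical sketch, not Logstash code. It appends the `-J-Xmx1024m` value from this thread to `JRUBY_OPTS` unless a max-heap flag is already present.

```ruby
# Assumption: 1024m mirrors the -J-Xmx1024m value that made the build pass above.
DEFAULT_HEAP = "1024m"

# Return a JRUBY_OPTS string that carries a max-heap flag.
# An existing -J-Xmx setting is respected and left untouched.
def jruby_opts_with_heap(current, heap = DEFAULT_HEAP)
  opts = current.to_s.split
  return opts.join(" ") if opts.any? { |opt| opt.start_with?("-J-Xmx") }
  (opts + ["-J-Xmx#{heap}"]).join(" ")
end

env = { "JRUBY_OPTS" => jruby_opts_with_heap(ENV["JRUBY_OPTS"]) }

# A wrapper script could then start rake in a child process with that
# environment, for example:
#   system(env, "rake", "bootstrap")
puts env["JRUBY_OPTS"]
```

Respecting an existing `-J-Xmx` flag keeps the wrapper from overriding a larger heap that a developer or CI job has already configured.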
Hi, running rake test:core after a bootstrap on Fedora (and on other Linuxes) breaks with the latest master. See https://gist.github.com/purbon/495be8d1dbe735ae9c7e95bcb6702c04 for details. This might be due to some issue with https and certificates, as running rake bootstrap with the Gemfile pointed to http://rubygems.org works without problems. Also, using jruby .23 (the latest used in logstash) fixes the issue, which makes me believe this is some incompatibility with the newly introduced jruby-openssl version; see https://github.com/elastic/logstash/commit/5de3ce40d1c2f12395ecee4685ca98fa7b206032, which is where this problem started, on https://logstash-ci.elastic.co/job/elastic-logstash-master/85/
Now testing other implications here, such as using the new packages on these platforms and trying to install plugins.
update
The plugin manager is also broken on Fedora and might be on other Linuxes, see
This error happens in all the Red Hat based distros we test with:
Other distributions are not affected by this issue.
5.0 will for sure have the same issue.