instana / ruby-sensor

💎 Ruby Distributed Tracing & Metrics Sensor for Instana
https://www.instana.com/
MIT License
26 stars 25 forks source link

Support for JRuby #142

Closed Adithya-copart closed 5 years ago

Adithya-copart commented 5 years ago

Hello,

Are there any plans to support JRuby in the future?

I scanned the repository for any .c files and did not find any. I noticed that Oj uses native C-extensions apart from byebug and pry-byebug in development.

Is there a design limitation that would prevent the usage of this gem in JRuby?

I noticed 3 test failures while using stdlib json instead of oj. I'll try to dig deeper if there is nothing else that limits the usage in JRuby.

Adithya-copart commented 5 years ago

bin/rake test results in a docker container built using FROM jruby:9.2.7.0-jdk

Adithya-copart commented 5 years ago

I'm hitting the following error after removing the dependency on oj. Looks like the agent scans for the JRuby process name and fails.

2019-08-07T20:43:16.668+0000 | ERROR | na-http-thread-1 | Ruby | 131 - com.instana.discovery-ruby - 1.3.0 | Could not find the process RubyProcess . Found 4 possible candidates. 

Supporting JRuby in the agent will help us collect metrics efficiently. I'm happy to try out any workarounds to test in JRuby.

pglombardo commented 5 years ago

Hi @Adithya-copart,

Re: json vs. Oj - this should work fine as Oj is a drop in replacement that we used for faster json encoding.

58 tests, 12452 assertions, 0 failures, 0 errors, 0 skips

Cool :-)

2019-08-06 16:12:40 +0000: Listen loop error: #<IOError: closed stream>
org/jruby/RubyIO.java:3552:in `select'
/usr/local/bundle/gems/puma-4.0.1-java/lib/puma/server.rb:380:in `handle_servers'
/usr/local/bundle/gems/puma-4.0.1-java/lib/puma/server.rb:354:in `block in run'
2019-08-06 16:12:40 +0000: Listen loop error: #<IOError: closed stream>

The test suite runs some puma webservers in background threads - this appears to be related to and should be limited to the test suite.

2019-08-07T20:43:16.668+0000 | ERROR | na-http-thread-1 | Ruby | 131 - com.instana.discovery-ruby - 1.3.0 | Could not find the process RubyProcess . Found 4 possible candidates.

My guess is that it is unrelated to the gem change.

Since we haven't added official JRuby support, this hasn't been thoroughly tested. It's possible that that error is transient and eventually resolves on its own. Could you confirm this? Do your JRuby processes show up on the dashboard?

Adithya-copart commented 5 years ago

@pglombardo Unfortunately, this did not resolve on its own. https://github.com/instana/ruby-sensor/blob/694b4187ff3bcc3d31b12ef8bd1b75ee197b75d3/lib/instana/agent.rb#L216 It occurs during the above PUT call to the agent.

I could not find the relevant code that handles this request in any of the public repositories. I suspect it is private. The agent probably uses the process info in the HTTP call to match against the list of running processes in its host. We're sending the pid, name and arguments in the payload.

I explored changing the process name using setproctitle method which is not working in JRuby as expected. The other solution I came across is to use prctl via a LD_PRELOAD hook to rename the process.

I'm not sure if this approach works but it'll be easy if instana agent supports this jruby process lookup.

I used the following code within a irb session to replicate this error. I made sure that the INSTANA_AGENT_HOST is the appropriate agent address as I had trouble running this gem in my local and I don't have access to a instana-agent sandbox for testing.

INSTANA_DEBUG = 1
INSTANA_AGENT_HOST='' #The Host IP
::Instana.config[:agent_host] = INSTANA_AGENT_HOST
require 'instana/setup'
::Instana.logger.level = :debug

Thread.new do
  puts "Starting Instana tracing..."
  ::Instana.agent.start
end
pglombardo commented 5 years ago

I could not find the relevant code that handles this request in any of the public repositories. I suspect it is private. The agent probably uses the process info in the HTTP call to match against the list of running processes in its host. We're sending the pid, name and arguments in the payload.

This is correct - the Instana host agent searches the process table for a matching process. The complexity comes when there are multiple processes with the same command line and/or in-container pids (like pid 1). There are steps to properly identify the correct process but obviously this hasn't been tested in JRuby.

Is it possible that you could try this on Linux to confirm? There we use the /proc filesystem to identify processes on Linux and is more reliable than OSX (without /proc).

Adithya-copart commented 5 years ago

@pglombardo The above results are from a docker container using linux.

I was looking into the code and noticed that @is_linux = (RUBY_PLATFORM =~ /linux/i) ? true : false always returns false as RUBY_PLATFORM is java in this case which is being used here

I will update it to RbConfig::CONFIG['host_os'] == 'linux' and update the results.

Adithya-copart commented 5 years ago

Still see the same error in the agent.

This could be due to the number of arguments we pass, the message Found 3 possible candidates is right as we have 3 JRuby processes running with the last 3 arguments different.

EDIT: Nvm, I was using the cached gem. It worked with the file descriptor sharing. I see some errors related to ::GC::Profiler which is expected. I'll run this for a while and will update the results.

14:08:27 sidekiq.1 | D, [2019-08-08T14:08:27.756785 #33] DEBUG -- : Instana: Transitioning to announced
14:08:27 sidekiq.1 | I, [2019-08-08T14:08:27.761276 #33]  INFO -- : Instana: Host agent available. We're in business. (announced pid:33 /usr/local/openjdk-8/bin/java)
Adithya-copart commented 5 years ago

Updates: I'm running into issues with the POST call that is responsible to report the metrics in the collector. These are the keys in the payload:

[:memory, :thread, :sensorVersion, :ruby_version, :rpl, :framework, :versions, :pid, :fd, :inode, :name, :exec_args] 

https://github.com/instana/ruby-sensor/blob/052fc3dc0965605700a75adfd57243a1775cffa3/lib/instana/collector.rb#L47-L88

The POST returns Response:#<Net::HTTPNotFound:0x379ef89c>. I added additional process info like the fd and inode to see if it changes the response but it did not.

After leaving the process for a little while, the only logs I see are the following POST calls from the thread and memory collectors:

13:16:34 web.1     | D, [2019-08-09T13:16:34.225457 #30] DEBUG -- : Instana: POST->http://<host>:42699/com.instana.plugin.ruby.29760 body:({}) Response:#<Net::HTTPOK:0x4af6e504> body:([])

13:16:34 web.1     | D, [2019-08-09T13:16:34.228226 #30] DEBUG -- : Instana: POST->http://<host>:42699/com.instana.plugin.ruby.29760 body:({"memory":{}}) Response:#<Net::HTTPOK:0x67e72858> body:([])

Currently, I'm trying to see why Sinatra/Rack requests are not being reported.

pglombardo commented 5 years ago

Nice progress :-) The metrics will get return 404 responses until the host agent is fully ready to relay the metrics so the 404s are expected for a short period after successful announce.

Adithya-copart commented 5 years ago

Looks good so far. No instana related errors in the logs with INSTANA_DEBUG=1. I can see both puma and sidekiq processes listed within the docker container in our instana dashboard. The sidekiq job failures are visible as well as the HTTP traces.

Opened https://github.com/instana/ruby-sensor/pull/148

pglombardo commented 5 years ago

148 merged and released in gem 1.10.1 thanks @Adithya-copart!

https://github.com/instana/ruby-sensor/releases/tag/1.10.1

Adithya-copart commented 5 years ago

@pglombardo We tried to use the rubygems version of 1.10.1.

This piece of code was a bad idea due to https://github.com/rubygems/rubygems/issues/1662#issuecomment-232067212. https://github.com/instana/ruby-sensor/blob/22cc0e1c0ae77097166d7a8c7d1de86ac89378e9/instana.gemspec#L40

The two options I came across were:

  1. Use platform specific gem and tweak the release process. Ex: puma does this.
  2. Use the platform = in gemspec.

I came across an example of someone who ran into a similar problem:

PR with Conditional: https://github.com/jnormington/dotdiff/pull/16 PR using platform =: https://github.com/jnormington/dotdiff/pull/17

pglombardo commented 5 years ago

Ah that's right - I forgot about this bit. I need to build a local JRuby version with the platform flag set and push that separately to Rubygems... Give me a bit.

pglombardo commented 5 years ago

I added the platform flag to the gemspec, built pushed a JRuby version of the gem.

Now on jruby-9.2.7.0, gem install pulls the java version and no Oj gem :-)

Screen Shot 2019-08-22 at 11 52 37 AM

Could you give it a try? https://rubygems.org/gems/instana/versions/1.10.1-java

Adithya-copart commented 5 years ago

Everything is good now. Thanks Peter!