newrelic / newrelic-ruby-agent

New Relic RPM Ruby Agent
https://docs.newrelic.com/docs/apm/agents/ruby-agent/getting-started/introduction-new-relic-ruby/
Apache License 2.0
1.2k stars 599 forks source link

Version 9.2.2 crashes rails app on load: LoadError: cannot load such file -- fiber #2001

Closed antonmorant closed 1 year ago

antonmorant commented 1 year ago

Description

I recently did a major upgrade on a rails app that is hosted with heroku and uses the NewRelic. This included upgrading the heroku stack, ruby version, rails version, and many gem & code updates to get everything back to working as usual (see environment details further down). When deploying, it updated to the latest version of the newrelic_rpm gem (there's no version requirement in Gemfile). This triggered an error that crashes the app on load: LoadError: cannot load such file -- fiber.

The app has 2 processes, a web server (using puma) and a worker for background jobs (using sidekiq). Both processes crash on startup.

After debugging and trying different things, I finally linked the crash to the newrelic_rpm gem update. I noticed in the changelog that v9.0.0 added support for Fiber instrumentation, so I tried downgrading the gem. Pegging the version to < 9.0 installs version 8.16.0, which fixes the crash (or, rather, works around it).

Expected Behavior

Rails app loads without crashing using newrelic_rpm v9.2.2.

Troubleshooting or NR Diag results

Crash stack trace:

heroku[web.1]: Starting process with command `bundle exec puma -p 11167`
app[web.1]: [2] Puma starting in cluster mode...
app[web.1]: [2] * Puma version: 6.2.2 (ruby 3.2.2-p53) ("Speaking of Now")
app[web.1]: [2] *  Min threads: 5
app[web.1]: [2] *  Max threads: 5
app[web.1]: [2] *  Environment: production
app[web.1]: [2] *   Master PID: 2
app[web.1]: [2] *      Workers: 2
app[web.1]: [2] *     Restarts: (✔) hot (✖) phased
app[web.1]: [2] * Preloading application
app[web.1]: W, [2023-05-05T13:47:14.656660 #2]  WARN -- sentry: ** [Raven] You are running on Heroku but haven't enabled Dyno Metadata. For Sentry's release detection to work correctly, please run `heroku labs:enable runtime-dyno-metadata`
app[web.1]: [2] ! Unable to load application: LoadError: cannot load such file -- fiber
app[web.1]: bundler: failed to load command: puma (/app/vendor/bundle/ruby/3.2.0/bin/puma)
app[web.1]: <internal:/app/vendor/ruby-3.2.2/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:37:in `require': cannot load such file -- fiber (LoadError)
app[web.1]: from <internal:/app/vendor/ruby-3.2.2/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:37:in `require'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/zeitwerk-2.6.8/lib/zeitwerk/kernel.rb:38:in `require'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/actionview-7.0.4.3/lib/action_view/renderer/streaming_template_renderer.rb:3:in `<top (required)>'
app[web.1]: from <internal:/app/vendor/ruby-3.2.2/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:37:in `require'
app[web.1]: from <internal:/app/vendor/ruby-3.2.2/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:37:in `require'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/zeitwerk-2.6.8/lib/zeitwerk/kernel.rb:38:in `require'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/sentry-raven-2.7.4/lib/raven/integrations/rails.rb:32:in `block (2 levels) in <class:Rails>'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/activesupport-7.0.4.3/lib/active_support/lazy_load_hooks.rb:95:in `class_eval'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/activesupport-7.0.4.3/lib/active_support/lazy_load_hooks.rb:95:in `block in execute_hook'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/activesupport-7.0.4.3/lib/active_support/lazy_load_hooks.rb:85:in `with_execution_control'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/activesupport-7.0.4.3/lib/active_support/lazy_load_hooks.rb:90:in `execute_hook'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/activesupport-7.0.4.3/lib/active_support/lazy_load_hooks.rb:76:in `block in run_load_hooks'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/activesupport-7.0.4.3/lib/active_support/lazy_load_hooks.rb:75:in `each'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/activesupport-7.0.4.3/lib/active_support/lazy_load_hooks.rb:75:in `run_load_hooks'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/actionview-7.0.4.3/lib/action_view/base.rb:276:in `<class:Base>'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/actionview-7.0.4.3/lib/action_view/base.rb:141:in `<module:ActionView>'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/actionview-7.0.4.3/lib/action_view/base.rb:12:in `<top (required)>'
app[web.1]: from <internal:/app/vendor/ruby-3.2.2/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:37:in `require'
app[web.1]: from <internal:/app/vendor/ruby-3.2.2/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:37:in `require'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/zeitwerk-2.6.8/lib/zeitwerk/kernel.rb:38:in `require'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/activesupport-7.0.4.3/lib/active_support/dependencies/autoload.rb:72:in `block in eager_load!'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/activesupport-7.0.4.3/lib/active_support/dependencies/autoload.rb:72:in `each_value'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/activesupport-7.0.4.3/lib/active_support/dependencies/autoload.rb:72:in `eager_load!'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/actionview-7.0.4.3/lib/action_view.rb:91:in `eager_load!'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/railties-7.0.4.3/lib/rails/application/finisher.rb:75:in `each'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/railties-7.0.4.3/lib/rails/application/finisher.rb:75:in `block in <module:Finisher>'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/railties-7.0.4.3/lib/rails/initializable.rb:32:in `instance_exec'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/railties-7.0.4.3/lib/rails/initializable.rb:32:in `run'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/railties-7.0.4.3/lib/rails/initializable.rb:61:in `block in run_initializers'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/3.2.0/tsort.rb:228:in `block in tsort_each'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/3.2.0/tsort.rb:350:in `block (2 levels) in each_strongly_connected_component'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/3.2.0/tsort.rb:431:in `each_strongly_connected_component_from'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/3.2.0/tsort.rb:349:in `block in each_strongly_connected_component'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/3.2.0/tsort.rb:347:in `each'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/3.2.0/tsort.rb:347:in `call'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/3.2.0/tsort.rb:347:in `each_strongly_connected_component'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/3.2.0/tsort.rb:226:in `tsort_each'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/3.2.0/tsort.rb:205:in `tsort_each'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/railties-7.0.4.3/lib/rails/initializable.rb:60:in `run_initializers'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/railties-7.0.4.3/lib/rails/application.rb:372:in `initialize!'
app[web.1]: from /app/config/environment.rb:5:in `<top (required)>'
app[web.1]: from <internal:/app/vendor/ruby-3.2.2/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:37:in `require'
app[web.1]: from <internal:/app/vendor/ruby-3.2.2/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:37:in `require'
app[web.1]: from config.ru:3:in `block in <main>'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/rack-2.2.7/lib/rack/builder.rb:116:in `eval'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/rack-2.2.7/lib/rack/builder.rb:116:in `new_from_string'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/rack-2.2.7/lib/rack/builder.rb:105:in `load_file'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/rack-2.2.7/lib/rack/builder.rb:66:in `parse_file'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/puma-6.2.2/lib/puma/configuration.rb:366:in `load_rackup'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/puma-6.2.2/lib/puma/configuration.rb:288:in `app'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/puma-6.2.2/lib/puma/runner.rb:158:in `load_and_bind'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/puma-6.2.2/lib/puma/cluster.rb:359:in `run'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/puma-6.2.2/lib/puma/launcher.rb:194:in `run'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/puma-6.2.2/lib/puma/cli.rb:75:in `run'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/gems/puma-6.2.2/bin/puma:10:in `<top (required)>'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/bin/puma:25:in `load'
app[web.1]: from /app/vendor/bundle/ruby/3.2.0/bin/puma:25:in `<top (required)>'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/3.2.0/bundler/cli/exec.rb:58:in `load'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/3.2.0/bundler/cli/exec.rb:58:in `kernel_load'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/3.2.0/bundler/cli/exec.rb:23:in `run'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/3.2.0/bundler/cli.rb:492:in `exec'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor/command.rb:27:in `run'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor/invocation.rb:127:in `invoke_command'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor.rb:392:in `dispatch'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/3.2.0/bundler/cli.rb:34:in `dispatch'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor/base.rb:485:in `start'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/3.2.0/bundler/cli.rb:28:in `start'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/gems/3.2.0/gems/bundler-2.4.10/libexec/bundle:45:in `block in <top (required)>'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/3.2.0/bundler/friendly_errors.rb:117:in `with_friendly_errors'
app[web.1]: from /app/vendor/ruby-3.2.2/lib/ruby/gems/3.2.0/gems/bundler-2.4.10/libexec/bundle:33:in `<top (required)>'
app[web.1]: from /app/bin/bundle:3:in `load'
app[web.1]: from /app/bin/bundle:3:in `<main>'
heroku[web.1]: Process exited with status 1
heroku[web.1]: State changed from starting to crashed

Steps to Reproduce

This issue might be specific to my app environment/config/code, so I'll just list what seems relevant noting that I have not attempted to reproduce this on a fresh new app:

Your Environment

Additional context

https://stackoverflow.com/questions/76201816/rails-deploy-fails-with-unable-to-load-application-loaderror-cannot-load-such/76201987

For Maintainers Only or Hero Triaging this bug

Suggested Priority (P1,P2,P3,P4,P5): Suggested T-Shirt size (S, M, L, XL, Unknown):

workato-integration[bot] commented 1 year ago

https://issues.newrelic.com/browse/NEWRELIC-8501

kaylareopelle commented 1 year ago

Hi @antonmorant, thanks for your detailed bug report and sleuthing thus far. I'm glad to hear downgrading your agent stops the error in your environment.

I'm taking a look at this issue and will get back to you soon with more details.

kaylareopelle commented 1 year ago

Hi @antonmorant, I followed your repro steps using an environment matching your list and couldn't reproduce the bug. Unfortunately, I think the problem may be more specific to your environment.

We use require 'fiber' to bring in the first-party Ruby Fiber library, as it isn't automatically made available in all Ruby versions and implementations. Our current hypothesis is that while the require is erroring out for you, you still actually have access to the Fiber library in your application.

To test that theory, please update the newrelic_rpm line in your Gemfile to the following:

gem 'newrelic_rpm', git: https://github.com/newrelic/newrelic-ruby-agent, branch: 'bugfix/fiber-load-error'

If that theory is wrong and your application really doesn't have access to the Fiber class, then we will see a different error to that effect.

antonmorant commented 1 year ago

Thank you for looking into this! I'm sorry for the long turnaround. I'll give this a shot and report back, you may expect to hear from me next week.

antonmorant commented 1 year ago

Hi @kaylareopelle, I have done a deploy with the newrelic gem pinned to bugfix/fiber-load-error as you requested and I'm afraid the result was the same. I was going to paste the logs here, but I checked and the error message and stack trace are actually identical, line by line.

fallwith commented 1 year ago

Hi @antonmorant! I work with @kaylareopelle and took a closer look at this one.

Here is what I currently understand:

At this point with the New Relic code itself not appearing in the stack trace, I don't have confidence that further changes made in a bugfix branch will actually be involved at the time of the error and have an impact on its behavior. I think the focus should be on determining why your app / tech stack errors out when encountering a require "fiber" call.

Are you able to obtain an interactive Irb session with your tech stack or perhaps a worker job that simply performs require "fiber" and nothing else? If the require call succeeds in one of those simple contexts but fails when your Rails app boots, then we at least have a finite list of differences to go by in troubleshooting the problem.

If you can produce a reproduction that demonstrates the problem and share it with us via a GitHub gist or repository, we would be happy to try to help troubleshoot. Please do not share any proprietary business logic or sensitive data.

I hope this gives you some ideas on next steps to try. We're happy to answer any questions you may have or being a sounding board for any ideas you'd like to share.

antonmorant commented 1 year ago

Hi @fallwith,

Thanks for looking into this and offering all the extra info. Very educational!

Everything you said above checks out and/or matches my experience and understanding of the issue. In particular, let me zoom into this one bit:

But upon inspecting your stack trace... I can't actually see any references to New Relic code. I see Bundler, RubyGems, Puma, Rails, and Sentry, but no mention of New Relic.

That is correct. The only data I have pointing to an issue with the newrelic gem is that 1) removing it from the Gemfile removes the error, and adding it back in resurfaces it again; and 2) switching versions on the newrelic gem also removes/resurfaces the error. I wonder if there is some undesired interaction where depending on the load order of libraries it results in an error or not.

Regarding your requests/suggestions:

Are you able to obtain an interactive Irb session with your tech stack or perhaps a worker job that simply performs require "fiber" and nothing else? If the require call succeeds in one of those simple contexts but fails when your Rails app boots, then we at least have a finite list of differences to go by in troubleshooting the problem.

Yes, I can log into the heroku dyno. I have tested both logging into a rails console as well as opening a shell session and then starting irb. In both cases, the instruction require "fiber" succeeds without an error and returns false.

If you can produce a reproduction that demonstrates the problem and share it with us via a GitHub gist or repository, we would be happy to try to help troubleshoot. Please do not share any proprietary business logic or sensitive data.

That's a great idea. It may take me a bit of time to put it together but I might be able to give it a shot.

If you happen to see that new log message in your log/newrelic_agent.log file, the message may possibly include some helpful information.

I didn't realize to look for that file when I was testing the bugfix branch, and because I have redeployed since it seems the file is nowhere to be found on the server machine. But it's good to know.

hannahramadan commented 1 year ago

Hi @antonmorant. I work with @kaylareopelle and @fallwith - we're all keeping tabs on this issue.

While the stack trace points to the error when the Rails code is reached, we agree there seems to be an unintended interaction between the newrelic and rails gems. We can't dig in much further without a reproduction (I've been unable to produce one yet), but I'd love to explore your theory about load order.

We have the configuration option to defer agent initialization until after rails initializers are run. Are you able to set the following as an environment variable using the same environment you're seeing the issue in?

NEW_RELIC_DEFER_RAILS_INITIALIZATION=true

We also want to confirm that when you tested with a Rails console and with irb that you were using the same problematic Gemfile that has v9 of our newrelic_rpm gem and Rails both present. Meaning, did you run bundle exec irb and bundle exec rails c or otherwise perform manual require statements of the New Relic and Rails gems to test?

kaylareopelle commented 1 year ago

Hi @antonmorant, we haven't heard from you for a while so we're going to close this issue. Feel free to reopen when you have more information to share.