getsentry / sentry-ruby

Sentry SDK for Ruby
https://sentry.io/for/ruby
MIT License
927 stars 493 forks source link

Profiling Rake Tasks launched with Cron #2238

Open ishields opened 7 months ago

ishields commented 7 months ago

Environment

SaaS (https://sentry.io/)

What are you trying to accomplish?

I am trying to profile Ruby/Rails rake tasks that run with Cron so that I can have insight into their performance (slow queries, n+1 issues, etc) like I do with normal rails transactions. I use a cron file run rake tasks. Ex:

0 0,2,4,6,8,10,12,14,16,18,20,22 * * * /bin/bash -l -c 'cd /home/deploy/projectName/releases/20240... && RAILS_ENV=production RAILS_ENV=production bundle exec rake environment myraketasks:hello_workd --silent >> log/cron.log 2>> log/cron-error.log'

It seems that even if I put profile sampling to 100%, these rake tasks are never picked up and I have no visibility into how they're performing.

I've also tried manually starting a transaction within the rake task. This seems like it would be the right thing to do however database queries and profiling data is still not logged.

desc 'This is a sample rake task'
  task :hello_world, [:force_staging, :mode] => :environment do |t, args|
    transaction = Sentry.start_transaction(op: "hello_world")
    ...
    do_work()
    ...
    transaction.finish
  end

How are you getting stuck?

I'm not sure how to enable profiling for these rake tasks so that I can monitor their performance.

Where in the product are you?

Performance

Link

No response

DSN

No response

Version

No response

getsantry[bot] commented 7 months ago

Assigning to @getsentry/support for routing ⏲️

getsantry[bot] commented 7 months ago

Routing to @getsentry/product-owners-performance-queries for triage ⏲️

k-fish commented 7 months ago

Hey there @ishields, it sounds like you might not have integrations setup correctly if you are seeing transactions but not database queries etc. which should be working if you are making database queries through ActiveRecord.

Double check that you have the correct gems installed (see automatic instrumentation in the docs)

If you're still having trouble you can post your config / init setup, or otherwise try asking for more ruby specific help in our https://github.com/getsentry/sentry-ruby sdk repo.

ishields commented 7 months ago

Thanks for your response. I think I have the right integrations setup because in the main rails app and in sidekiq everything behaves correctly. It's really just the rake tasks (launched via cron) where this is an issue.

Here is my configuration:

Sentry.init do |config|
    puts 'Initializing Sentry'

    config.dsn = '.....'
    config.breadcrumbs_logger = %i[active_support_logger http_logger]
    config.enable_tracing = true

    # Set traces_sample_rate to 1.0 to capture 100%
    # of transactions for performance monitoring.
    # We recommend adjusting this value in production.
    config.traces_sample_rate = 1
    config.profiles_sample_rate = 1
    config.sample_rate = 1

    config.enabled_environments = %w[development production]
ScheduleJobsWorker].map(&:downcase)
    config.traces_sampler =
      lambda do |context|
        LogUtil.info("Sentry Trace #{context}")
        return true
      end
  end
# sentry
gem 'stackprof' #use for sentry
gem "sentry-ruby"
gem "sentry-rails"
gem "sentry-sidekiq"

I put some print statements in my config below - when running the rake tasks it prints 'Initializing Sentry' but doesn't print "Sentry Trace #{context}". The only way I managed to get it to print Sentry Trace is by manually creating a transaction as shown in my original example.

k-fish commented 7 months ago

I think I have the right integrations setup because in the main rails app and in sidekiq everything behaves correctly

Just to clarify, you are seeing database spans etc. in regular rails usage, just not rake?

It sounds like it might be an sdk specific issue, I'm going to tag the sdk team to help you look further.

ishields commented 7 months ago

Yep, that's correct I see database spans etc in regular rails usage (the webserver and the sidekiq tasks). This issue is only happening for me with rake tasks. Thanks!

sl0thentr0py commented 7 months ago

@ishields you will need to start the transaction manually in a rake task (we do that for request/response cycles automatically in rails but otherwise not).

However, once started the database spans should be there, I'll look into that.

ishields commented 7 months ago

I see, thanks for letting me know. Yea unfortunately when I start the transaction manually the database connections do not come in and I get very little details on the transaction. I'm looking to dig into the transaction stack to view performance issues like I do with other rails web app or sidekiq transactions. Also would you happen to have any nice tricks for adding the start transaction code to all rake tasks? It'd be awesome if there was a hook or patch that could be added which would do this for all of them automatically. Also, once we figure the issue with the db queries out maybe I can help add documentation for this problem as I've spent quite a while trying to figure this out thinking it was an issue on my end. Happy to help add this documentation.

sl0thentr0py commented 7 months ago

yes you can just patch/prepend Rake::Application#execute and start the transaction, like this https://github.com/getsentry/sentry-ruby/blob/master/sentry-ruby/lib/sentry/rake.rb

ishields commented 7 months ago

@sl0thentr0py Thanks. I've added a patch to my app. Will monitor on production and see what data it collects / whether it starts pulling in any database queries or performance data. Just curious if there is a reason this isn't default behavior?

Here is what the patch looks like:

# This patch will start a transaction on every rake task so we can collect data about it.
require 'rake'
require 'rake/task'

module Sentry
  module Rake
    module Task
      def execute(args = nil)
        if Sentry.initialized? && !Sentry.configuration.skip_rake_integration
          transaction = Sentry.start_transaction(name: "rake #{name}", op: 'rake')
        end

        super
      ensure
        transaction&.finish
      end
    end
  end
end

# @api private
module Rake
  class Task
    prepend(Sentry::Rake::Task)
  end
end
sl0thentr0py commented 7 months ago

we could add it, the only place I can see where it wouldn't work is for really long running tasks and the transaction would be dropped by our ingestion service.

@st0012 do you think we should auto start transactions for every rake task?

st0012 commented 7 months ago

My main concern is around the variety of rake tasks that are out there, and how the feature could affect some of them. For example, with this feature all rake tasks will start sending network requests to Sentry, which may cause side-effect to some tasks?

I wonder if we can support it at a per-task level, like providing a sentry_monitored_task DSL:

sentry_monitored_task :hello_world, [:force_staging, :mode] => :environment do |t, args|
  # ...
end 

And then we do one of these depending on which is easier:

sl0thentr0py commented 7 months ago

yep let's go with an optional patch now that we have config.enabled_patches anyway.

ishields commented 7 months ago

Thanks, I think this makes a lot of sense. Too many rake tasks out there to just force opt this in. I'm still not sure if db queries are being logged. I can see them in the rake task trace when digging in but haven't confirmed they show up in performance metrics.

st0012 commented 7 months ago

yep let's go with an optional patch now that we have config.enabled_patches anyway.

It's not uncommon for a decent-sized project to have dozens or more of rake tasks, with various running frequencies and importance. And personally I'd only opt-in to this feature for a few selected tasks to reduce noise and cost. Therefore, I think it's better to create a wrapper method too in addition to the optional patch config.

solnic commented 1 month ago

Another option is to decorate tasks with Sentry based on configuration. For example we could have config.monitored_rake_tasks = ["db:migrate", "my_app:*"] and then go through this and do Rake::Task[name_of_task].extend(Sentry::Rake::Task).

This way it would be following the same approach as everything else, whereas with the sentry_monitored_task we'd be adding a special way of defining rake tasks, which could be considered as too obtrusive.