scientist-softserv / utk-hyku

⚙️ Add auto-cleanup for GoodJob #574

Closed jeremyf closed 11 months ago

jeremyf commented 11 months ago

This commit introduces automatic cleanup of successfully finished jobs. It is here to help with the overall performance of GoodJob as well as the application database. Note that the cleanup schedule is a guess at what is appropriate.
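For orientation, a cleanup schedule like this could be expressed in a Rails initializer roughly as follows. This is a minimal sketch and not the diff from this commit: the file path and every value are illustrative assumptions. Only the setting names come from GoodJob's configuration.

```ruby
# config/initializers/good_job.rb (assumed location; values are placeholders)
Rails.application.configure do
  # Attempt a cleanup at most once per this many seconds (hypothetical value).
  config.good_job.cleanup_interval_seconds = 600

  # Only destroy jobs that finished more than this many seconds ago
  # (hypothetical value: 14 days).
  config.good_job.cleanup_preserved_jobs_before_seconds_ago = 14 * 24 * 60 * 60

  # Keep discarded jobs around so they can still be inspected.
  config.good_job.cleanup_discarded_jobs = false
end
```

The interval can alternatively be set with the `GOOD_JOB_CLEANUP_INTERVAL_SECONDS` environment variable.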

Code Walk Through of GoodJob Configuration

The [README](https://github.com/bensheldon/good_job/blob/11b05e525d6cc0d4023b8b8b6b9824c40503b712/README.md?plain=1#L280) explains the configuration for the Scheduler.

```markdown
- `cleanup_interval_seconds` (integer) Number of seconds a Scheduler will wait before cleaning up preserved jobs. Defaults to `nil`. Can also be set with the environment variable `GOOD_JOB_CLEANUP_INTERVAL_SECONDS`.
```

Here's the implementation of [GoodJob::Configuration](https://github.com/bensheldon/good_job/blob/11b05e525d6cc0d4023b8b8b6b9824c40503b712/lib/good_job/configuration.rb#L211-L220) regarding `cleanup_interval_seconds`. The Rails configuration takes precedence over the environment variable, which takes precedence over the default. The resulting value is used by the `GoodJob::CleanupTracker`.

```ruby
def cleanup_interval_seconds
  value = (
    rails_config[:cleanup_interval_seconds] ||
      env['GOOD_JOB_CLEANUP_INTERVAL_SECONDS'] ||
      DEFAULT_CLEANUP_INTERVAL_SECONDS
  )
  value.present? ? value.to_i : nil
end
```

The [GoodJob::CleanupTracker](https://github.com/bensheldon/good_job/blob/11b05e525d6cc0d4023b8b8b6b9824c40503b712/lib/good_job/cleanup_tracker.rb#L23-L29) has a `#cleanup?` method that looks at either the job count or the elapsed seconds. Its answer informs the `GoodJob::Scheduler`.

```ruby
def cleanup?
  (cleanup_interval_jobs && job_count > cleanup_interval_jobs) ||
    (cleanup_interval_seconds && last_at < Time.current - cleanup_interval_seconds) ||
    false
end
```

The [GoodJob::Scheduler](https://github.com/bensheldon/good_job/blob/11b05e525d6cc0d4023b8b8b6b9824c40503b712/lib/good_job/scheduler.rb#L180-L193) observes the tasks as they complete. After each finished task it either runs `#cleanup` (when the tracker says so) or creates the next task.

```ruby
def task_observer(time, output, thread_error)
  error = thread_error || (output.is_a?(GoodJob::ExecutionResult) ? output.unhandled_error : nil)
  GoodJob._on_thread_error(error) if error

  instrument("finished_job_task", { result: output, error: thread_error, time: time })
  return unless output

  @cleanup_tracker.increment
  if @cleanup_tracker.cleanup?
    cleanup
  else
    create_task
  end
end
```

The [GoodJob::Scheduler](https://github.com/bensheldon/good_job/blob/11b05e525d6cc0d4023b8b8b6b9824c40503b712/lib/good_job/scheduler.rb#L233-L250)'s `#cleanup` method resets the tracker and delegates the cleanup to the performer, which is a `GoodJob::JobPerformer`. Once the cleanup finishes, its observer creates the next task.

```ruby
def cleanup
  @cleanup_tracker.reset

  future = Concurrent::Future.new(args: [self, @performer], executor: executor) do |_thr_scheduler, thr_performer|
    Rails.application.executor.wrap do
      thr_performer.cleanup
    end
  end

  observer = lambda do |_time, _output, thread_error|
    GoodJob._on_thread_error(thread_error) if thread_error
    create_task
  end
  future.add_observer(observer, :call)
  future.execute
end
```

The [GoodJob::JobPerformer](https://github.com/bensheldon/good_job/blob/11b05e525d6cc0d4023b8b8b6b9824c40503b712/lib/good_job/job_performer.rb#L60-L64) then runs the general process `GoodJob.cleanup_preserved_jobs` (which is also available via the CLI).

```ruby
def cleanup
  GoodJob.cleanup_preserved_jobs
end
```

The [GoodJob.cleanup_preserved_jobs](https://github.com/bensheldon/good_job/blob/11b05e525d6cc0d4023b8b8b6b9824c40503b712/lib/good_job.rb#L130-L153) method is the one that ultimately cleans up preserved jobs. Note that the `include_discarded` handling reads as a double negative (e.g. `old_jobs.not_discarded unless include_discarded`). We are not including discarded jobs, so the query is limited to jobs that are `not_discarded`.

```ruby
def self.cleanup_preserved_jobs(older_than: nil)
  configuration = GoodJob::Configuration.new({})
  older_than ||= configuration.cleanup_preserved_jobs_before_seconds_ago
  timestamp = Time.current - older_than
  include_discarded = configuration.cleanup_discarded_jobs?

  ActiveSupport::Notifications.instrument("cleanup_preserved_jobs.good_job", { older_than: older_than, timestamp: timestamp }) do |payload|
    old_jobs = GoodJob::ActiveJobJob.where('finished_at <= ?', timestamp)
    old_jobs = old_jobs.not_discarded unless include_discarded
    old_jobs_count = old_jobs.count

    GoodJob::Execution.where(job: old_jobs).destroy_all
    payload[:destroyed_records_count] = old_jobs_count
  end
end
```
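To make the trigger condition in `#cleanup?` easier to reason about, here is a standalone sketch of the same logic in plain Ruby. `SketchCleanupTracker` is a hypothetical name and not GoodJob's class. The sketch uses `Time.now` instead of `Time.current`, takes the clock as an argument so it can be exercised without waiting, and assumes that `reset` zeroes the job count and records the current time.

```ruby
# Standalone model of the cleanup trigger; NOT GoodJob's implementation.
class SketchCleanupTracker
  attr_reader :job_count, :last_at

  # Either interval may be nil, which disables that trigger.
  def initialize(cleanup_interval_seconds: nil, cleanup_interval_jobs: nil, now: Time.now)
    @cleanup_interval_seconds = cleanup_interval_seconds
    @cleanup_interval_jobs = cleanup_interval_jobs
    reset(now: now)
  end

  # Called once for every finished job task.
  def increment
    @job_count += 1
  end

  # True when more than cleanup_interval_jobs jobs have finished, or more
  # than cleanup_interval_seconds have elapsed, since the last reset.
  def cleanup?(now: Time.now)
    return true if @cleanup_interval_jobs && @job_count > @cleanup_interval_jobs
    return true if @cleanup_interval_seconds && @last_at < now - @cleanup_interval_seconds

    false
  end

  # Called when a cleanup starts (assumed behavior).
  def reset(now: Time.now)
    @job_count = 0
    @last_at = now
  end
end

# Usage: with a 600 second interval, a cleanup becomes due only once
# more than 600 seconds have passed since the last reset.
start = Time.at(0)
tracker = SketchCleanupTracker.new(cleanup_interval_seconds: 600, now: start)
tracker.increment
due_early = tracker.cleanup?(now: start + 600) # exactly 600s: not yet due
due_late = tracker.cleanup?(now: start + 601)  # past the interval: due
```

With both intervals left at `nil`, `cleanup?` never returns true, which matches the README's default of `nil` for `cleanup_interval_seconds`.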
