scientist-softserv / scholarworks

Cal State Hyrax

Get Fixity checks running #8

Open crisr15 opened 1 year ago

crisr15 commented 1 year ago

Story

Links to: https://github.com/csuscholarworks/scholarworks/issues/58

Acceptance Criteria

Example diff

```git
diff --git a/lib/tasks/file_fixity_check.rake b/lib/tasks/file_fixity_check.rake
index a557369..87d272e 100644
--- a/lib/tasks/file_fixity_check.rake
+++ b/lib/tasks/file_fixity_check.rake
@@ -6,28 +6,21 @@
 #
 # this method is adapted from Hyrax::FileSetFixityCheckService
-def file_set_fixity_check(file_set_id)
-  files = FileSet.find(file_set_id).files
-
-  files.each do |file|
-    versions = file.has_versions? ? file.versions.all : [file]
-    versions.collect do |v|
-      FixityCheckJob.perform_now(v.uri.to_s, file_set_id: file_set_id, file_id: file.id)
-    end.flatten
-  end
+def file_set_fixity_check(file_set)
+  Hyrax::FileSetFixityCheckService.new(file_set, async_jobs: false).fixity_check
 
   puts '**********************************************'
-  puts "COMPLETED THE FIXITY CHECK FOR FILESET: #{file_set_id}"
+  puts "COMPLETED THE FIXITY CHECK FOR FILESET: #{file_set.id}"
 end
 
 namespace :calstate do
   desc 'Run a fixity check on a single file'
   task :file_fixity_check, [:file_set_id] => [:environment] do |_t, args|
-    file_set_fixity_check(args[:file_set_id])
+    file_set_fixity_check(FileSet.find(args[:file_set_id]))
   end
 
   desc 'Run a fixity check on all files'
   task all_files_fixity_check: :environment do
-    FileSet.find_each { |file_set| file_set_fixity_check(file_set.id) }
+    FileSet.find_each { |file_set| file_set_fixity_check(file_set) }
   end
 end
```

1. Cron tasks run in a limited environment and don't have access to the "normal" $PATH, so the full path is required. Without it, the task throws errors like this

2. Currently, the rake task is configured to run on every version of every file every day. This is not the desired behavior. The Hyrax::FileSetFixityCheckService has built-in logic to wait X amount of time before checking the same file again.
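
The "wait X amount of time before checking the same file again" idea in note 2 can be sketched roughly as below. This is a standalone illustration, not Hyrax's actual code: `needs_fixity_check?`, `MAX_DAYS_BETWEEN_CHECKS`, and the 7-day window are hypothetical names and values chosen for the example.

```ruby
# Hypothetical window; the value used by the real service may differ.
MAX_DAYS_BETWEEN_CHECKS = 7
SECONDS_PER_DAY = 24 * 60 * 60

# Returns true when a file has never been checked, or when its most recent
# check is older than the allowed window. Otherwise the check is skipped.
def needs_fixity_check?(last_checked_at, now: Time.now, max_days: MAX_DAYS_BETWEEN_CHECKS)
  return true if last_checked_at.nil?

  (now - last_checked_at) > max_days * SECONDS_PER_DAY
end

now = Time.utc(2023, 11, 27, 3, 18, 1)
puts needs_fixity_check?(nil, now: now)                               # never checked
puts needs_fixity_check?(Time.utc(2023, 11, 26, 3, 21, 1), now: now)  # checked the day before
puts needs_fixity_check?(Time.utc(2023, 11, 1), now: now)             # checked weeks earlier
```

With a guard like this, a nightly cron run only re-verifies files whose last check has aged out, instead of every version of every file every day.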

Testing Instructions

  1. Deploy changes to production server
  2. The day after the deploy, ssh into the production server (e.g. hyrax_0)
    • cron task automatically runs overnight
  3. Look at the /var/log/hyrax/fixitycheck.log file to see if the task ran successfully
    • Each run is timestamped with YYYY-MM-DD for your grepping convenience
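
For step 3, a small script can do the grepping. The sketch below assumes the log format shown in the fixitycheck.log excerpt posted later in this thread: each run starts with a timestamped line, and a failed run contains a line that reads exactly `failure`. The helper names `runs_by_date` and `failed_dates` are hypothetical, and what a successful run logs is not known here, so only failures are detected.

```ruby
# Matches the timestamp prefix seen in the log, e.g. "2023-11-26 03:21:01+00:00 ...".
RUN_START = /\A(\d{4}-\d{2}-\d{2}) \d{2}:\d{2}:\d{2}\S*/

# Groups log lines by run date. A new run starts at each timestamped line;
# the lines that follow belong to it. Runs on the same date are merged.
def runs_by_date(log_text)
  runs = Hash.new { |hash, date| hash[date] = [] }
  current = nil
  log_text.each_line(chomp: true) do |line|
    if (match = RUN_START.match(line))
      current = match[1]
    end
    runs[current] << line if current
  end
  runs
end

# Assumption: a run failed when one of its lines is exactly "failure".
def failed_dates(log_text)
  runs_by_date(log_text).select { |_date, lines| lines.include?('failure') }.keys
end

sample = <<~'LOG'
  2023-11-26 03:21:01+00:00 bundler: command not found: rails
  Install missing gem executables with `bundle install`
  failure
  2023-11-27 03:18:01+00:00 bundler: command not found: rails
  Install missing gem executables with `bundle install`
  failure
LOG

puts failed_dates(sample).inspect
```

On the server, the same helpers could be pointed at the real file with `failed_dates(File.read('/var/log/hyrax/fixitycheck.log'))`.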

crisr15 commented 1 year ago

Lea Ann: Customer ran fixity check a long time ago and it had errors.

Working on getting the code running where I can create a fileset locally to run the task.

Rob suspects this is an issue with the production Fedora setup, but we need to get this running locally to see whether it's an application error or an issue with the configuration on prod.

Created a rake task and cron job to run fixity checks on a regular basis. We may or may not need that code to be merged.

labradford commented 1 year ago

https://gitlab.com/notch8/cal-state-hyrax/-/merge_requests/12 is the merge request in GitLab that created a cron job to automatically run the fixity check. I think the real issue here is that they are getting false errors with the existing fixity checking.

alishaevn commented 1 year ago

related pr's:

alishaevn commented 1 year ago

I didn't see the link to the GitLab PR before I created the rake tasks above (no cron job, though). I'll see what happens on their prod server with the tasks I've created before determining/discussing whether we need to add the cron job and task from the PR.

The conversation with David about testing this work on prod is happening in Slack.

aprilrieger commented 1 year ago

Cron job: https://github.com/scientist-softserv/scholarworks/issues/17

https://github.com/scientist-softserv/scholarworks/issues/17#issuecomment-1633531868

bkiahstroud commented 9 months ago

Error found on the server:

$ cat /var/log/hyrax/fixitycheck.log

2023-11-26 03:21:01+00:00 bundler: command not found: rails
Install missing gem executables with `bundle install`
failure
2023-11-27 03:18:01+00:00 bundler: command not found: rails
Install missing gem executables with `bundle install`
failure

bkiahstroud commented 4 months ago

Script to create works / bypass broken forms (No input found for multi_value error)

a = AdminSet.find <admin set ID>
50.times do |i|
  Dataset.create(title: ["test #{i}"], campus: ['Sacramento'], admin_set_id: a.id, depositor: "admin@example.com")
end

bkiahstroud commented 3 weeks ago

The PR for this was merged on Aug 8