sul-dlss / preservation_catalog

Rails application to track, audit and replicate archival artifacts associated with SDR objects.
https://sul-dlss.github.io/preservation_catalog/

[EPIC] pres cat maintenance #1487

Closed jmartin-sul closed 1 year ago

jmartin-sul commented 4 years ago

Here's a meta ticket to start collecting suggestions for making Preservation Catalog easier to develop, more robust, more usable, etc. We can spawn individual actionable tickets from this for our upcoming maintenance work cycle.

Some broad categories that I think might be useful, with some starter ideas for things I know I'd like to improve:

Improvements to make development more pleasant

Functional improvements

alright, that's enough from me for now.

Please do:

Thanks all!

mjgiarlo commented 4 years ago

Great write-up! Thanks, @jmartin-sul. MoabValidationHandler and PreservedObjectHandler are my big ones. Relevant: https://github.com/sul-dlss/preservation_catalog/pull/1277

ndushay commented 4 years ago

one of the things I'm wondering about: I think the problem with refactoring MVH and POH is that they are so ... big. Would it be more tractable to either do something smaller, or start with the smallest subset of stuff in the new class(es) and gradually move more and more over?

jmartin-sul commented 4 years ago

> one of the things I'm wondering about: I think the problem with refactoring MVH and POH is that they are so ... big. Would it be more tractable to either do something smaller, or start with the smallest subset of stuff in the new class(es) and gradually move more and more over?

i think so, or at least for POH (i don't think MVH is a ton of code, it just has interactions with consumers that are sometimes surprising, which makes refactoring and using it a pain).

but yeah, for POH, i anticipate something where we start chipping away at obvious refactoring opportunities in individual methods, as opposed to a grand plan for re-arranging it all at once. that feels both easier and less regression prone.

but also, considering the upcoming ManyCats work, i think literally just renaming PreservedObjectHandler to CompleteMoabHandler would be a nice start, because i think the latter name would be more accurate. an annoying but mechanical bunch of find/replace work.

jmartin-sul commented 4 years ago

and i think annoying but mechanical renamings have a good track record of making this codebase more intelligible (e.g. the class naming discussions we had toward the end of the original work cycle -- i'm glad we did that work when we did).

mjgiarlo commented 4 years ago

@jmartin-sul Given what @justinlittman found in google-books yesterday re: Dir.chdir not being thread-safe, we will need to change some code in prescat before we can make the jump from Resque to Sidekiq (assuming we'll be running Sidekiq in its default multi-threaded mode—it defaults to six threads per process).

Problem

DruidVersionZip#create_zip! uses Dir.chdir and this method is invoked in a background job.
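
To illustrate why this matters, here is a minimal sketch (not code from prescat) showing that the working directory set by a `Dir.chdir` block is process-wide: a second thread that calls `Dir.pwd` while the block is open sees the changed directory. The helper name `pwd_seen_by_other_thread` is hypothetical, and the two queues exist only to make the interleaving deterministic.

```ruby
require "tmpdir"

# Hypothetical helper: returns the working directory that the calling thread
# observes while a separate worker thread is inside a Dir.chdir block.
def pwd_seen_by_other_thread(dir)
  entered = Queue.new
  done = Queue.new

  worker = Thread.new do
    Dir.chdir(dir) do
      entered << true # signal that the worker is now inside the chdir block
      done.pop        # stay inside the block until the observer has looked
    end
  end

  entered.pop
  observed = Dir.pwd  # read from the calling thread, not the worker
  done << true
  worker.join
  observed
end

Dir.mktmpdir do |tmp|
  puts "calling thread started in: #{Dir.pwd}"
  puts "calling thread saw:        #{pwd_seen_by_other_thread(tmp)}"
end
```

In a multi-threaded job runner, any relative path resolved by another job during that window would be resolved against the wrong directory.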

Solutions

  1. Stick with Resque. Meh.
  2. Configure Sidekiq to run one thread per process. Meh.
  3. Make the following, tiny patch (HT @justinlittman: https://github.com/sul-dlss/google-books/pull/529). From:
    Dir.chdir(work_dir.to_s) do
      combined, status = Open3.capture2e(zip_command)
      raise "zipmaker failure #{combined}" unless status.success?
    end

to:

    combined, status = Open3.capture2e(zip_command, chdir: work_dir.to_s)
    raise "zipmaker failure #{combined}" unless status.success?
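
As a rough check of option 3, the sketch below runs a child process with the `chdir:` option and confirms that the parent's working directory is untouched. The helper `run_in_dir` is hypothetical (it is not the prescat method), and the child command here is just the current Ruby interpreter printing its own working directory, standing in for the real zip command.

```ruby
require "open3"
require "rbconfig"
require "tmpdir"

# Hypothetical helper: runs `command` with `dir` as the child's working
# directory. Only the spawned process changes directory; the parent's cwd
# (shared by all of its threads) is left alone.
def run_in_dir(dir, *command)
  combined, status = Open3.capture2e(*command, chdir: dir.to_s)
  raise "command failure #{combined}" unless status.success?

  combined
end

Dir.mktmpdir do |tmp|
  child_pwd = run_in_dir(tmp, RbConfig.ruby, "-e", "print Dir.pwd")
  puts "child ran in:    #{child_pwd}"
  puts "parent still in: #{Dir.pwd}"
end
```

Because the directory change happens only in the child, nothing needs to be restored afterwards, which is what makes this form safe to call from concurrent threads.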

FWIW

I scanned sul-dlss for Dir.chdir and found not much at all. Lots of hits, to be sure, but they're mostly in binstubs, gemspecs, tests, and scripts. So other than this part of prescat, I do not foresee any other thread-safety-related surprises stemming from Dir.chdir as we make the move to multi-threaded Sidekiq across the board. cc: @sul-dlss/infrastructure-team

jmartin-sul commented 4 years ago

> 3. Make the following, tiny patch (HT @justinlittman: [sul-dlss/google-books#529](https://github.com/sul-dlss/google-books/pull/529)). From:
>
>         Dir.chdir(work_dir.to_s) do
>           combined, status = Open3.capture2e(zip_command)
>           raise "zipmaker failure #{combined}" unless status.success?
>         end
>
>    to:
>
>         combined, status = Open3.capture2e(zip_command, chdir: work_dir.to_s)
>         raise "zipmaker failure #{combined}" unless status.success?

i like option 3. thanks for the research, @mjgiarlo! filed a specific ticket for this: #1519

ndushay commented 1 year ago

@jmartin-sul can this EPIC be closed? It's over 2 years old.

ndushay commented 1 year ago

@jmartin-sul I'm closing this EPIC that is over 2 years old.

jmartin-sul commented 1 year ago

> @jmartin-sul I'm closing this EPIC that is over 2 years old.

thanks! seems reasonable. also very happy to see how much of this we ended up getting done!