sciencehistory / scihist_digicoll

Science History Institute Digital Collections

add finding aid links to all archives department works without one #2627

Open archivistsarah opened 3 months ago

archivistsarah commented 3 months ago

TODO

Description

Currently, our metadata creation practice is that all works/items belonging to the archives department contain a publicly accessible link from the work in the digital collections to the public finding aid in ArchivesSpace for the collection the work belongs to.

But because we launched the Digital Collections well before the ArchivesSpace public user interface was in use, only archives DC works created after the ASpace PUI was in use contain that link. Approximately 5000 archives works created before the ASpace PUI launch are missing this link and archives staff would like for it to be retroactively added to their metadata.

Links to finding aids are added to metadata using the Related Link fields:

Category = Finding aid
URL = [something that looks like] https://archives.sciencehistory.org/repositories/3/resources/236

It appears on the front end of the Digital Collections as a visually distinct button ("collection guide") in the Institutional Location section.

[Image: institutional location fields on a DC work]

Other info: All archival collection-level works in the DC should already contain a link to their finding aid in the collection's related link field. Is it possible to have works belonging to these collections automatically inherit this link? As a one-time batch update?

All archival works in the DC should be from a fully processed archival collection that should have a public finding aid in the ArchivesSpace PUI. It is possible, however, that archival items were digitized from collections that do not yet have public records in ArchivesSpace (this shouldn't be happening, but just trying to cover my bases here).

I can generate a CSV containing the accession number (in the DC metadata this is stored in External ID: Accession No.), title, and public URL for each archives collection with a public finding aid and/or only the collections that have digitized content in the DC.
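A minimal sketch of how such a CSV might be matched against works by accession number. The column names (`accession_number`, `title`, `url`), the sample rows, and the hashes standing in for works are all assumptions for illustration; real works would come from the application's own models.

```ruby
require "csv"

# Hypothetical CSV export; the column names and rows are assumed, not real data.
FINDING_AIDS_DATA = <<~DATA
  accession_number,title,url
  2001.001,Example Papers,https://archives.example.org/repositories/3/resources/1
  2005.014,Sample Records,https://archives.example.org/repositories/3/resources/2
DATA

# Build a lookup table from accession number to finding aid URL.
def finding_aid_index(csv_text)
  CSV.parse(csv_text, headers: true).each_with_object({}) do |row, index|
    index[row["accession_number"].strip] = row["url"].strip
  end
end

# Stand-ins for works; in the DC metadata the accession number is stored in
# External ID: Accession No.
works = [
  { id: "w1", accession_number: "2001.001" },
  { id: "w2", accession_number: "1999.777" } # no matching row in the CSV
]

index = finding_aid_index(FINDING_AIDS_DATA)

# Split works into those with and without a matching finding aid.
matched, unmatched = works.partition { |w| index.key?(w[:accession_number]) }

matched.each   { |w| puts "#{w[:id]} -> #{index[w[:accession_number]]}" }
unmatched.each { |w| puts "#{w[:id]} has no finding aid in the CSV" }
```

Keeping accession numbers as strings avoids losing leading or trailing zeros, which is why no CSV converters are used here.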

Also, ASpace has an API and OAI-PMH functionality that are options if we need them for this or other projects.

jrochkind commented 1 month ago

Let's talk about whether we want to just display the link from the Collection a Work is part of (possibly changing the workflow going forward?), or whether we want to continue adding this duplicate link to every Work, and do that in bulk.

Initially at next weekly meeting?

archivistsarah commented 1 month ago

Sure, I can pop into the weekly meeting.

jrochkind commented 1 month ago

We decided to display the Finding Aid link from the linked Collection, so it does not need to be entered on each individual Work.

We're also going to run a report of any Works in Archive that are not linked to a public collection with a finding aid link.

eddierubeiz commented 1 month ago

@archivistsarah - in Staging, at least, the only Archives department collections in the d.c. that do not currently have external URL links to Aspace are these three:

I suspect this is just fine :)

eddierubeiz commented 1 month ago

Oh, looks like Houdry is here: https://archives.sciencehistory.org/repositories/3/resources/63

jrochkind commented 1 month ago

@eddierubeiz Can you also check for Archival Works that are not assigned to a public Archives department collection though?

The way I would do this is first complete the main work, then basically you want to look for an Archives department Work that does not have a finding aid to display under the new logic. This could be because it doesn't have a (public) Collection assigned as well as if the Collection does not have a finding aid link.
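A rough sketch of the two failure cases described above, assuming plain Ruby Structs in place of the app's models. The Struct names, the `published` flag, and the reason symbols are hypothetical and only illustrate the distinction between "no public Collection" and "Collection has no finding aid link".

```ruby
# Plain-Ruby stand-ins for the app's models; all names here are hypothetical.
LinkStub       = Struct.new(:category, :url)
CollectionStub = Struct.new(:title, :published, :related_links)
WorkStub       = Struct.new(:id, :collections)

# True when the collection carries at least one finding aid link.
def finding_aid?(collection)
  collection.related_links.any? { |link| link.category == "finding_aid" }
end

# Returns nil when a finding aid can be displayed, otherwise the reason it cannot.
def missing_finding_aid_reason(work)
  public_collections = work.collections.select(&:published)
  return :no_public_collection if public_collections.empty?
  return :collection_lacks_finding_aid if public_collections.none? { |c| finding_aid?(c) }

  nil
end

with_aid    = CollectionStub.new("Papers A", true,  [LinkStub.new("finding_aid", "https://archives.example.org/1")])
without_aid = CollectionStub.new("Films B",  true,  [])
unpublished = CollectionStub.new("Draft C",  false, [LinkStub.new("finding_aid", "https://archives.example.org/3")])

works = [
  WorkStub.new("w1", [with_aid]),
  WorkStub.new("w2", [without_aid]),
  WorkStub.new("w3", [unpublished]),
  WorkStub.new("w4", [])
]

# Keep only the works that need attention, keyed by id.
report = works.to_h { |w| [w.id, missing_finding_aid_reason(w)] }.compact
report.each { |id, reason| puts "#{id}: #{reason}" }
```

Recording the reason alongside each work makes it easier for archives staff to tell which items need a Collection assigned and which need a link added to an existing Collection.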

archivistsarah commented 1 month ago

@eddierubeiz correct -- those two (SHI 16mm and CHEMStudy) do not have finding aids. If this is a roadblock I can spend some time on them this week?

eddierubeiz commented 1 month ago

No problem at all @archivistsarah .

eddierubeiz commented 1 month ago

Hi @archivistsarah . FYI, here's a list of works for which:

Archival works in the digital collection with no finding aid.xlsx

It's not much, roughly 200 items. Happy to discuss results with you.

eddierubeiz commented 1 month ago

Recipe for the above:

require "csv"

# Report Archives works that have no finding aid link of their own
# and whose published collections have none either.
csv_string = CSV.generate do |csv|
    Work.where("json_attributes -> 'department' ?  'Archives'").each do |w|
        # Finding aid URL stored directly on the work, if any.
        work_related_link = w.related_link.select { |rl| rl.category == "finding_aid" }.map(&:url).first
        # Finding aid URL from any published collection that contains the work.
        coll_related_link = w.contained_by.where(published: true).flat_map { |col| col.related_link.select { |rl| rl.category == "finding_aid" }.map(&:url) }.first
        csv << [w.friendlier_id, w.parent&.friendlier_id, w.title.truncate(50), w.published?, w.contained_by&.first&.title&.truncate(20), w.physical_container&.box] if work_related_link.nil? && coll_related_link.nil?
    end
end
puts csv_string;0

eddierubeiz commented 1 month ago

Just for future reference, I pasted the above code into a Ruby file, then ran:

cat export.rb | heroku run console --no-tty > prod_out.csv

Then I imported the CSV into Google Sheets, pasted the resulting table into Excel, and uploaded that to GitHub.

jrochkind commented 1 month ago

> If this is a roadblock I can spend some time on them this week?

Not any kind of roadblock for us. We're just making a report of every Archives work that won't have a link to a Finding Aid after this change, so you can make sure none of them are mistakes or problems. Whether they are or not is up to you!

eddierubeiz commented 1 month ago

Yeah, there's no roadblock at all - rather, we're working on parallel tracks.

eddierubeiz commented 4 weeks ago

Metadata guidelines: The finding aid link will be stored in as few places as possible.

eddierubeiz commented 4 weeks ago

  1. @archivistsarah is going to make changes to the metadata per the above guidelines.
  2. In parallel, Eddie has the green light to modify the code so that works also take collections, parent works, and the collections of parent works into consideration when looking for a finding aid to display in the work metadata.

jrochkind commented 4 weeks ago

I just hope @archivistsarah isn't planning on manually editing hundreds of works to, say, remove currently existing links. We can do that as a batch for her, right?

eddierubeiz commented 4 weeks ago

There's some piecemeal work, and Sarah can and will request any batch changes from me. It should be very manageable.

archivistsarah commented 4 weeks ago

> I just hope @archivistsarah isn't planning on manually editing hundreds of works to, say, remove currently existing links. We can do that as a batch for her, right?

Definitely not planning on it! I'll let Eddie know if/when that needs to happen.

I'm going to give Jahna a list of the new collections that need to be created and the parent/standalone works (without finding aid links currently) that should belong to them. She'll create the collections and add the works as this is a bit more in her responsibilities than mine.

eddierubeiz commented 3 weeks ago

When Sarah and Jahna are done with the first round of modifications, we can check their work with:

Work.where("json_attributes -> 'department' ?  'Archives'").each do |w|
    arr = WorkShowInfoComponent.new(work: w).links_to_finding_aids.to_a
    pp  [w.friendlier_id, w.parent&.friendlier_id, w.title.truncate(50), w.published?, w.contained_by&.first&.title&.truncate(20), w.physical_container&.box] if arr.empty?
end;0

That should give us a list of works for which the new recipe STILL can't find a finding aid.
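For reference, the fallback lookup described earlier in this thread (the work itself, its collections, its parent work, and the parent's collections) might look roughly like the sketch below. It uses plain Structs rather than the app's models, the order in which sources are consulted is an assumption, and it is not the actual `links_to_finding_aids` implementation.

```ruby
# Hypothetical stand-ins; a record can be a work or a collection.
RelatedLinkStub = Struct.new(:category, :url)
RecordStub = Struct.new(:links, :collections, :parent, keyword_init: true) do
  # URLs of the finding aid links stored directly on this record.
  def finding_aid_urls
    (links || []).select { |link| link.category == "finding_aid" }.map(&:url)
  end
end

# Gather finding aid URLs from the work, its collections, its parent work,
# and the parent's collections, in that (assumed) order, without duplicates.
def finding_aids_for(work)
  candidates = [work, *work.collections.to_a]
  if work.parent
    candidates << work.parent
    candidates.concat(work.parent.collections.to_a)
  end
  candidates.flat_map(&:finding_aid_urls).uniq
end

aid        = RelatedLinkStub.new("finding_aid", "https://archives.example.org/resources/1")
collection = RecordStub.new(links: [aid])
parent     = RecordStub.new(links: [], collections: [collection])
child      = RecordStub.new(links: [], parent: parent)  # inherits via the parent's collection
orphan     = RecordStub.new(links: [])                  # nothing to inherit from

puts finding_aids_for(child).inspect
puts finding_aids_for(orphan).inspect
```

A work for which this kind of lookup returns an empty list is the case the check above is meant to surface.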

eddierubeiz commented 3 weeks ago

(After we run this last step, fine to move this issue to "review".)