Closed rmjaffe closed 1 year ago
@tmariamora @katedundon As I go about assigning collection-level subject terms, I plan to refer to the finding aids on OAC. Are LCSH subject terms generally found in the collection details, or might they appear in other sections of the finding aids?
@rmjaffe you are correct: subject terms in OAC finding aids are in Collection Details under Indexing Terms (or sometimes under the heading Subjects and Indexing Terms). They can also be found in the collection-level catalog record.
@NedHenry I just went into the Harry Mayo collection to add subject terms to the collection-level metadata, but it doesn't look like there's an option to do so. When clicking Edit Collection, the Title, Abstract/Summary and Thumbnail properties are exposed. When I clicked on Additional Fields, a few more options revealed themselves, including a Keyword property, but none of our controlled subject properties: subjectName, subjectPlace, subjectTopic, subjectTitle.
Can these properties be added to the collection editing form? Would that be an easy change? If so, I can make a ticket for the work. If it isn't, I'm thinking we and the rest of the team should talk about what it would mean to use the keyword property instead.
@snehagunduraoUL @rschwab Question for you: I just opened the collection metadata editing form for the Steven Rees collection and noticed that the form looks different than the editing form for works (thanks, N8). Specifically, it does not look like one can search and select to add controlled values for subject and other metadata properties. If I want to add controlled terms (e.g. names from LC or locations from Geonames), can I enter the exact string in the collection editing form, or do I add the URI? Or, if I were to add a string in this form, would it add it to the local vocabulary instead of recognizing it as belonging to a controlled vocabulary? In cases where I would want to add controlled values, would it be better to round trip than to add them manually?
Editing form for Steve Rees collection metadata:
Editing form for random work in the DAMS:
Ok I was finally able to test this out. Here's what I found:
Creator field
creator is not on collection edit form
creator exports as URI (this is how it should work)
creator displays on collection dashboard page (example)
Using subject person as controlled vocab example
subject person exports as whatever is entered into the collection edit form
ex: https://id.loc.gov/authorities/names/n79056359 | #<ActiveTriples::Resource:0x0000000009c8e8b8> | Kofksy, Frank
subject person is not visible anywhere in the front-end AFAIK
Round tripping subject person did not alter the values at all, i.e. they remained as:
https://id.loc.gov/authorities/names/n79056359 | #<ActiveTriples::Resource:0x0000000009c8e8b8> | Kofksy, Frank
I think what all this shows is that controlled vocabulary fields are broken on the collection edit form. To figure out the extent of what this means I think we need to create a place on the front end to display these values - this can be a code change that only lives on sandbox and is then reverted, as we want to test this but not actually alter our front end.
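To make the exported value above easier to audit, here is a minimal Python sketch that splits a multi-valued cell and labels each value as a URI, an `#<ActiveTriples::Resource:...>` placeholder, or a plain label. It assumes values are separated by `|`, as in the example; the function names `classify_value` and `classify_cell` are made up for illustration and are not part of Hyrax or the round-trip tooling.

```python
import re

# Assumption: multi-valued cells in the export are separated by "|",
# as in the example value quoted in this comment.
ACTIVE_TRIPLES = re.compile(r"^#<ActiveTriples::Resource:0x[0-9a-fA-F]+>$")
URI = re.compile(r"^https?://\S+$")


def classify_value(value: str) -> str:
    """Label one exported value as 'uri', 'active_triples', or 'label'."""
    value = value.strip()
    if ACTIVE_TRIPLES.match(value):
        return "active_triples"
    if URI.match(value):
        return "uri"
    return "label"


def classify_cell(cell: str) -> list[tuple[str, str]]:
    """Split a pipe-delimited cell and classify each non-empty value."""
    return [(v.strip(), classify_value(v)) for v in cell.split("|") if v.strip()]


cell = (
    "https://id.loc.gov/authorities/names/n79056359 | "
    "#<ActiveTriples::Resource:0x0000000009c8e8b8> | Kofksy, Frank"
)
for value, kind in classify_cell(cell):
    print(f"{kind}: {value}")
```

Only the first of the three values is in the form the creator field exports (a URI); the other two are what the collection edit form appears to leave behind.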
@rschwab Disappointing but not surprising that this is the case. As for how the controlled terms are indexed: what is being indexed, the URIs or the labels? Let's say it indexes URIs. If I added URIs to subjectPlace, subjectName, etc. using the collection metadata editing interface, would those be indexed along with the URIs that had been entered on the work level? Or vice versa: if it's indexing the strings and I entered strings in the form, would it index them along with the string values on the works?
To affirm what already may be known/assumed: Most of the metadata values entered at the collection level are entered there for two reasons: faceting and/or to be inherited by works. Only the values that currently display on the front end need to display.
And apologies for my lack of understanding here -- the indexing has long been a black box. Before continuing to investigate this, would it make sense to put together a new ticket for 1) getting the values to display in sandbox and 2) doing more testing?
@snehagunduraoUL What do you think of the recent comments here? I haven't figured this out yet, but here's the relevant commits that were made to enable this feature on the collection edit form: https://github.com/UCSCLibrary/ucsc-library-digital-collections/commits/master/app/forms/hyrax/forms/collection_form.rb
I think there's several things we need to do to resolve these issues:
This code looks relevant as well: https://github.com/UCSCLibrary/ucsc-library-digital-collections/commits/master/app/assets/javascripts/hyrax
@rschwab In terms of supplying the collection-level subject terms, I was going to source those from the corresponding catalog records in UC Library Search. As far as testing to see if round tripping works as a solution, I can easily plug them (strings or URIs) into a spreadsheet for round tripping.
Would it make best sense to try to export the collection metadata records? Or should I create a spreadsheet with just the Hyrax IDs, the subject properties, and the values therein?
Whichever is easiest for you. Just note that the required fields are:
Plus whatever data you're trying to change. I think for controlled vocab terms the URI would be more consistent than using a label, but I haven't done a lot of testing on that.
As curious as I am about exporting, I'll just create a spreadsheet from scratch. I blocked time to do that on Monday afternoon.
@rschwab Round tripping spreadsheet is in the DAMS shared drive: https://docs.google.com/spreadsheets/d/1rMoYpcoAi6v_Vqd9IdLVR74jtlU4BFGA/edit?usp=sharing&ouid=114117346070027385497&rtpof=true&sd=true
SubjectName and likely all the controlled vocabulary terms added to the collection edit form are not being indexed properly. Here is an example search demonstrating that the controlled term is not indexed for the Steve Rees collection.
@rschwab Do you recommend creating another ticket for getting the subject properties to index correctly? Could whatever mechanism is enabling creator to work properly be easily applied to the subjects?
Yes we'll need a ticket but I'm still exploring the nuance here. I just ran an import on a new collection and those were indexed correctly.
Test findings:
Imported new collection with controlled terms for creator, person, and place
#<ActiveTriples> format)
Exported collection
Roundtripped collection
It appears the troubles are limited to those records with the #<ActiveTriples> format for controlled terms. Perhaps the roundtrip would fix the issue. So far I'm unable to reproduce the steps to get a term in the #<ActiveTriples> format; it appears to be the format when these terms were created during a BulkOps import.
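One way to find which records carry the `#<ActiveTriples>` format is to scan an exported sheet for it. The sketch below is only an illustration: the column names (`id`, `subjectName`, `subjectPlace`), the record ids, and the `example.org` URI in the sample are made up, and the real export layout may differ.

```python
import csv
import io

# Substring that marks an unresolved controlled-vocabulary value.
MARKER = "#<ActiveTriples"


def find_affected_rows(csv_text: str) -> dict[str, list[str]]:
    """Map each record id to the columns holding an ActiveTriples placeholder."""
    affected = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        columns = [
            name
            for name, value in row.items()
            if name != "id" and value and MARKER in value
        ]
        if columns:
            affected[row["id"]] = columns
    return affected


# Hypothetical export: ids, columns, and the example.org URI are invented.
sample = """id,subjectName,subjectPlace
abc123,#<ActiveTriples::Resource:0x0000000009c8e8b8>,https://example.org/place/1
def456,https://id.loc.gov/authorities/names/n79056359,
"""
print(find_affected_rows(sample))
```

Running this against a full collection export would give a short list of records to compare with the collections that search correctly.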
Can we manually delete those ActiveTriples values from the couple of collections that have subject values, using the editing form in the UI? Would it work to do that and then round trip to add the correct versions of those values (and all the other values) back in?
This is broken for works according to #512 but I tested for the collection edit form on sandbox and could successfully remove an ActiveTriples value.
So yes, you can probably do that, or round tripping without using enumerated columns should also overwrite any values currently in there.
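If the round-trip route is used, the placeholders could be stripped from the spreadsheet values before re-import. A minimal sketch, assuming cells are `|`-delimited as in the exported example earlier in this thread; `drop_placeholders` is a hypothetical helper, not part of the import tooling, and whether labels should be kept alongside URIs is left as an open question.

```python
import re

# Matches one whole ActiveTriples placeholder value.
PLACEHOLDER = re.compile(r"#<ActiveTriples::Resource:0x[0-9a-fA-F]+>")


def drop_placeholders(cell: str, sep: str = "|") -> str:
    """Return the cell with ActiveTriples placeholder values removed."""
    values = [v.strip() for v in cell.split(sep)]
    kept = [v for v in values if v and not PLACEHOLDER.fullmatch(v)]
    return f" {sep} ".join(kept)


cell = (
    "https://id.loc.gov/authorities/names/n79056359 | "
    "#<ActiveTriples::Resource:0x0000000009c8e8b8> | Kofksy, Frank"
)
print(drop_placeholders(cell))
```

The URI and the label are left untouched; only the placeholder is dropped.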
@rschwab I was just playing with the collection metadata records in sandbox, but it's difficult to know the result of what I'm doing, as there is no full display of the collection metadata apart from the editing form itself. Also noting now the piece about the subjectPlace values not being indexed or faceted. We definitely want them indexed and faceted. Is this a quick edit to the configuration file, or should I create a ticket for that issue?
This ticket is still making my head spin; I booked a time for us on Thursday to talk about it -- unless we decide we don't need to!
Here's a summary of my current understanding of the status of all issues in this thread:
#<ActiveTriples> in the edit form and export sheets, these are legacy values from BulkOps and should be removed.
Edit: Crossed out 4, these values could have been coming from some of the recent code surrounding controlled vocabularies, and may already be fixed. At minimum, they are not necessarily leftovers from BulkOps.
Terms have been added in production; but if for any reason we edit and save the collection metadata records, the values will republish as active triples. Until the collection editing form is fixed, any updates to collection records must be done via round tripping.
Descriptive Summary
In order to enable collections to be matched when searching, the collection-level records need to contain subject terms. Collections will need to be subject analyzed, and/or the subject terms assigned to the collection's MARC finding aids will need to be copied into the DAMS records.
Background
This approach is much simpler and more straightforward than pulling subject terms assigned to works within the collection up to the collection level.
Acceptance Criteria
This is what done looks like:
Related Work
Enhance collection metadata editing form UCSCLibrary/dams_project_mgmt#435 needs to be completed before this can be done.