ycba-cia / blacklight-collections2

index creator with LOC NAF formatting #433

Closed yulgit1 closed 1 year ago

yulgit1 commented 1 year ago

addressed: https://github.com/ycba-cia/coboat-github/commit/2685a4b93072e935095c7b38952141402ffb2147 https://git.yale.edu/ermadmix/ycba_xslts/commit/6760b5c941de7196cda1fa18ace0472391788548

edgartdata commented 1 year ago

The updated LIDO XML for Turner's Wreckers now shows the Library of Congress Name Authority File (LOC NAF) format for Turner's name: Turner, J. M. W. (Joseph Mallord William), 1775-1851

This is what the added LIDO XML line looks like:

Turner, J. M. W. (Joseph Mallord William), 1775-1851

![LOC NAF artist name example in LIDO XML](https://user-images.githubusercontent.com/4093644/214587702-f8535f81-1464-4f6d-89a4-e97f466edd25.png)

The plan is to vend this name format for all TMS artists to the BlackLight Creator facet so that they are reconciled with the names for the artists coming from the RBM and ref Lib collections. We hope this will make this facet easier to use.

edgartdata commented 1 year ago

@yulgit1 Is the plan still to show name reconciliation in DEV?

flapka commented 1 year ago

@edgartdata Most artist name information from LoC should align well with our artist information in TMS. But rarely -- especially for lesser-known artists -- the LoC information may be incomplete or incorrect. If we come across discrepancies, let me know. In many cases, we may be able to improve/correct the LoC records.

edgartdata commented 1 year ago

@flapka I will let you know if I come across LOC records that do not reflect our current TMS artist information.

Also amazing that you/RBM can improve LOC records! Is there a way for you to generate a report of YCBA Orbis records that have changes to the artists names? Eric built such a report for us in TMS and it is very helpful. I am acutely aware that our current name reconciliation plan relies heavily on CIA/RBM communication :)

flapka commented 1 year ago

@edgartdata Sorry, I may be missing something obvious, but it's unclear what's meant by "YCBA Orbis records that have changes to the artists names". Could you phrase it another way?

edgartdata commented 1 year ago

I was thinking about cases when RBM makes improvements or corrections to artist information for RBM records. Is there a way for you to share these changes with me so that I can change the Creator LOC names in TMS? Happy to chat on the phone if it's easier.

flapka commented 1 year ago

Ah, now that makes perfect sense. The most recent example that springs to mind is our enhancement of the LoC record for Francis Barlow, to correct his life dates (based on research by Nathan Flis): http://id.loc.gov/authorities/names/n84233924

Such changes are relatively infrequent, so sharing them with CIA should require little effort (just have to remember to do so!).

edgartdata commented 1 year ago

@flapka I entered the Barlow life dates suggested by Nathan in TMS a few years back as well, so it's great that they are in sync across our collections. I will think of other cases like that in TMS and share them with you.

I just entered his LOC NAF name and URI in TMS so we should see reconciliation on this artist tomorrow.

Question: knowing that it will take CIA several weeks to enter the LOC NAF URIs in TMS, do we want to wait until CIA is done or slowly release the reconciled names to the production facet for artists/creators?

Also can we come up with a better name for the Creator facet?

flapka commented 1 year ago

@edgartdata I don't have an opinion on the slow versus all-at-once question. Either way seems fine.

With apologies for my confusion: Why do we need to rename the Creator facet? I may have missed something.

yulgit1 commented 1 year ago

@flapka, one reason for thinking through the naming of the facet: there is an outstanding question of whether to keep both facets, the harmonized LOC name and the qualified creator from TMS. For example, if you check out the interface https://ycba-collect-creatorloc-otl1lr.herokuapp.com and drill down to the ~3000 Turners in the CreatorLOC facet, you'll still see a great number of facet entries in the original qualified Creator facet (mostly because of multiple creators in a record). But the 2 facets might be very useful in tandem this way, especially with the qualifiers, even if not entirely obvious to the user.

edgartdata commented 1 year ago

@yulgit1 In order to help us decide if we should keep both facets (for the record, I would prefer only having one, namely the one with the reconciled/harmonized names), can you please let us know if it might be possible to have a drop down in the facet that would show us the names with their attribution qualifiers?

edgartdata commented 1 year ago

The other outstanding question to the group is: In an object record, such as this print made after Turner, what do we want to happen when a user clicks on “after Joseph Mallord William Turner, 1775–1851, British”? Should this link return only the works after Turner? Or should it return all the works by Turner, whether they are by him directly or attributed to him or after him?

@yulgit1 I wonder if it might be possible to make a distinction between attribution qualifiers (a controlled vocabulary of fewer than 10 terms) and production-related roles (such as 'print made by')? If yes, then maybe an answer to my question would be, when clicking on “after Joseph Mallord William Turner, 1775–1851, British”, to only return works after Turner, but ignore the production-related roles (based on the fact that there is a facet to narrow results by collection/classification) and return all works directly by Turner.

yulgit1 commented 1 year ago

@edgartdata - Regarding your questions above: (1) See the Creator Pivot facet I just added: https://ycba-collect-creatorloc-otl1lr.herokuapp.com. I wish it were collapsible; unfortunately, that option is not available in the version of Blacklight we are using. (2) Two possibilities I can think of: (a) rely on Role to specify production-related roles, leaving prefix to contain only the short list of attribution qualifiers; (b) delve into the attributes in the cross reference characteristics tab.

edgartdata commented 1 year ago

@yulgit1 Thanks for the Creator Pivot facet. I'll look at your calendar to talk about the various options.

edgartdata commented 1 year ago

@flapka Would it be possible to create a LoC record for William Henry Millais, 1828-1899?

flapka commented 1 year ago

When a detailed record has a creator field such as this:

Creator: Attributed to John Constable, 1776–1837, British

Would it be possible to keep the text as it is but alter the hyperlinking, to something like this, where it excludes the qualifier and links on LOC facet name:

Creator: Attributed to John Constable, 1776–1837, British

If so, would that be desirable?

yulgit1 commented 1 year ago

@edgartdata and I were talking about this scenario. In thinking further it might be somewhat possible if the qualifier and creator were separate fields. From a LIDO:

Formerly John Constable, 1776–1837, British
John Constable
Constable, John, 1776-1837
Formerly

We would have to stop using the displayActorInRole like above and instead use either the preferred or LOC NAF. But then the issue would arise that the preferred doesn't have life dates and nationality, and wouldn't reconcile with the MARC (the MARC uses LOC NAF, correct?). But then if we used LOC NAF, that doesn't exist for every creator, so you'd end up having to fall back on preferred anyway in those cases to stay in sequence with the qualifiers, and you would have to index the fallback case together with the LOC NAF to make the facets work, yet still reconcile with the MARC.

yulgit1 commented 1 year ago

Towards normalizing the creator field: indexing attrib_qual_ss and loc_nav_author_ss separately in the same sequence. When a LOC NAF name doesn't exist, fall back on the preferred name. loc_nav_author_ss in this form will be the facet. On the item page the attrib_qual_ss will precede the loc_nav_author_ss. The attrib_qual_ss will be unlinked, while the loc_nav_author_ss will be linked to the facet.

https://git.yale.edu/ermadmix/ycba_xslts/commit/0bbece2f5b195000d9e91f0a48cca2ec0d0aa490

This also leaves the option open to have an attrib_qual_ss facet, and to link to that facet for attrib_qual_ss on the item page. So one could drill down using these 2 fields in tandem (e.g., facet on 'formerly by' and 'Turner, J. M. W. (Joseph Mallord William), 1775-1851' at the same time).

Indexing overnight. Test blacklight forthcoming.
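For readers following along, here is a minimal Python sketch of the parallel-field idea described above. It is not the XSLT from the linked commit. The `Creator` type, both function names, and the choice to store an empty string when a creator has no qualifier are assumptions made for illustration; only the field names `attrib_qual_ss` and `loc_nav_author_ss` come from this thread.

```python
from typing import Dict, List, Optional, Tuple, TypedDict


class Creator(TypedDict):
    """Hypothetical input shape for one creator on a record."""
    qualifier: str            # attribution qualifier, e.g. "Formerly"; "" if none
    loc_naf: Optional[str]    # LOC NAF heading, if one has been entered
    preferred: str            # preferred name, used as the fallback


def build_creator_fields(creators: List[Creator]) -> Dict[str, List[str]]:
    """Return two multi-valued fields whose entries line up by position."""
    qualifiers: List[str] = []
    names: List[str] = []
    for creator in creators:
        # Append to both lists on every pass so index i always refers
        # to the same creator in both fields.
        qualifiers.append(creator["qualifier"])
        # Fall back on the preferred name when no LOC NAF heading exists.
        names.append(creator["loc_naf"] or creator["preferred"])
    return {"attrib_qual_ss": qualifiers, "loc_nav_author_ss": names}


def display_pairs(doc: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Pair each (unlinked) qualifier with the name that links to the facet."""
    return list(zip(doc["attrib_qual_ss"], doc["loc_nav_author_ss"]))
```

Keeping a placeholder entry for creators without a qualifier is what keeps the two lists in sequence; whether the index preserves empty values in a multi-valued field would need to be checked against the actual setup.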

edgartdata commented 1 year ago

This Met example is something we aim to implement: https://www.metmuseum.org/art/collection/search/367843?what=Prints&ft=after+turner&offset=0&rpp=40&pos=1

yulgit1 commented 1 year ago

@edgartdata - See Creator facet, and Creator on item pages:

https://ycba-collect-creatorloc-otl1lr.herokuapp.com/ https://ycba-collect-creatorloc-otl1lr.herokuapp.com/catalog/tms:26091 https://ycba-collect-creatorloc-otl1lr.herokuapp.com/catalog/tms:23 https://ycba-collect-creatorloc-otl1lr.herokuapp.com/tms:61394

yulgit1 commented 1 year ago

Looking at the above, I see that the fallback to preferred name doesn't have life dates like the LOC NAF. So I concatenated the life dates to the preferred name where they exist. Indexing overnight, to check tomorrow.

https://git.yale.edu/ermadmix/ycba_xslts/commit/9a63b251e785dcfa73d0ba0d3c5a966b06507f94

edgartdata commented 1 year ago

@yulgit1 Your link to https://ycba-collect-creatorloc-otl1lr.herokuapp.com/tms:61394 does not work?

yulgit1 commented 1 year ago

sorry typo: https://ycba-collect-creatorloc-otl1lr.herokuapp.com/tms:61394 should be https://ycba-collect-creatorloc-otl1lr.herokuapp.com/catalog/tms:61394

edgartdata commented 1 year ago

Update: Eric to concatenate the preferred natural-language-order name and the TMS Display Bio field (which contains the display life dates with qualifiers), separated by a comma. This concatenation will be used in the detailed BL records for TMS collection items. It will also be a hyperlink (minus the attribution qualifier or production role qualifier) to query all works by the artist across all YCBA collections. Looking forward to seeing this in DEV!

yulgit1 commented 1 year ago

updated coboat: https://github.com/ycba-cia/coboat-github/commit/d9510bad62cd9c946adc0a8f7ad43748ee98fe39 should be in harvester-test tomorrow
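As an illustration only (this is not the code in the coboat commit), the name and Display Bio concatenation could be sketched in Python as follows. The function name `display_name` is hypothetical, and skipping the comma when the bio is empty is an assumption about how records without display dates should be handled.

```python
def display_name(preferred: str, display_bio: str = "") -> str:
    """Join the preferred name and the display bio with ", ".

    Empty or whitespace-only parts are dropped, so a creator with no
    display dates does not end up with a dangling separator.
    """
    parts = [preferred.strip(), display_bio.strip()]
    return ", ".join(part for part in parts if part)
```

Filtering empty parts before joining avoids having to special-case the separator for creators that have no life dates.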

edgartdata commented 1 year ago

Great news: Eric's concatenation of the TMS preferred name and TMS Display Bio worked.

(Note: the TMS Display Bio field is newly added to LIDO XML for this purpose, but not as its own separate LIDO element, because the standard lacks a structure for display bio/life dates.)

https://harvester-test.britishart.yale.edu/oaicatmuseum/OAIHandler?verb=GetRecord&identifier=oai:tms.ycba.yale.edu:11722&metadataPrefix=lido

<lido:nameActorSet>
<lido:appellationValue lido:pref="alternate" lido:label="LOC NAF">Turner, J. M. W. (Joseph Mallord William), 1775-1851</lido:appellationValue>
<lido:appellationValue lido:pref="alternate">JMW Turner</lido:appellationValue>
<lido:appellationValue lido:pref="preferred">Joseph Mallord William Turner</lido:appellationValue>
<lido:appellationValue lido:pref="alternate">J. M. W. Turner R. A.</lido:appellationValue>
<lido:appellationValue lido:label="Alpha Sort" lido:pref="alternate">Turner Joseph Mallord William, 1775–1851</lido:appellationValue>
<lido:appellationValue lido:label="Display" lido:pref="alternate">Joseph Mallord William Turner, 1775–1851</lido:appellationValue>
</lido:nameActorSet>
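A small Python sketch of how a consumer might pick the facet name out of a `nameActorSet` like the one above, preferring the LOC NAF appellation and falling back when it is absent. The function name `pick_appellation`, the fallback order (LOC NAF, then Alpha Sort, then the preferred name), and the namespace URI are assumptions; the harvested record's own namespace declaration should be checked before relying on this.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Sequence

LIDO_NS = "http://www.lido-schema.org"  # assumed LIDO namespace URI


def pick_appellation(
    name_actor_set: ET.Element,
    labels: Sequence[str] = ("LOC NAF", "Alpha Sort"),  # assumed priority order
) -> Optional[str]:
    """Return the first appellationValue matching a label, else the preferred name."""
    label_attr = f"{{{LIDO_NS}}}label"
    pref_attr = f"{{{LIDO_NS}}}pref"
    values = name_actor_set.findall(f"{{{LIDO_NS}}}appellationValue")
    # Try each label in priority order.
    for wanted in labels:
        for value in values:
            if value.get(label_attr) == wanted and value.text:
                return value.text.strip()
    # Last resort: the appellation marked as preferred.
    for value in values:
        if value.get(pref_attr) == "preferred" and value.text:
            return value.text.strip()
    return None


# Abridged copy of the record above, wrapped with a namespace declaration.
SAMPLE = f"""
<lido:nameActorSet xmlns:lido="{LIDO_NS}">
  <lido:appellationValue lido:pref="alternate" lido:label="LOC NAF">Turner, J. M. W. (Joseph Mallord William), 1775-1851</lido:appellationValue>
  <lido:appellationValue lido:pref="preferred">Joseph Mallord William Turner</lido:appellationValue>
  <lido:appellationValue lido:label="Display" lido:pref="alternate">Joseph Mallord William Turner, 1775–1851</lido:appellationValue>
</lido:nameActorSet>
"""

turner = pick_appellation(ET.fromstring(SAMPLE))
```

Passing a different `labels` tuple changes the fallback order without touching the lookup logic.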

Next step: do the same in harvester-bl and index.

yulgit1 commented 1 year ago

Need to rerun harvester-test to handle cases with no display dates:

https://github.com/ycba-cia/coboat-github/commit/29e3da6fb680b74cc3a1e016feb02e90db781c86

<lido:appellationValue lido:label="Alpha Sort" lido:pref="alternate">unknown artist, </lido:appellationValue>
<lido:appellationValue lido:label="Display" lido:pref="alternate">unknown artist, </lido:appellationValue>

edgartdata commented 1 year ago

Looks like it worked:

<lido:nameActorSet>
<lido:appellationValue lido:pref="alternate">Unknown artist</lido:appellationValue>
<lido:appellationValue lido:pref="preferred">unknown artist</lido:appellationValue>
<lido:appellationValue lido:label="Alpha Sort" lido:pref="alternate">unknown artist</lido:appellationValue>
<lido:appellationValue lido:label="Display" lido:pref="alternate">unknown artist</lido:appellationValue>
</lido:nameActorSet>

yulgit1 commented 1 year ago

This is ready for evaluation.

https://ycba-collect-creatorloc-otl1lr.herokuapp.com/catalog/tms:8057

https://ycba-collect-creatorloc-otl1lr.herokuapp.com/catalog/orbis:7988401

Maybe check a few more, especially TMS objects with multiple qualifiers/creators (FN LN, dates), ORBIS records with multiple creators (LN, FN, dates), and their links to the shared TMS/ORBIS facets.

Let me know when ready to move to production.

yulgit1 commented 1 year ago

deployed to production

edgartdata commented 1 year ago

Hi all, great news! We now have a simplified Creator facet in place.

The names of all the creators and authors represented in all the YCBA collections now appear in the Creator facet in reversed natural-language order (last name, first name, life dates) without duplications.

There are 2 additional features:

flapka commented 1 year ago

Nice work!

One glitch: in TMS records, life dates are separated by an en dash, or something similar, e.g. Romney, George, 1734–1802; in library and archives records, the life dates are separated by a hyphen, e.g. Romney, George, 1734-1802.

When both forms exist they show up as separate entries in the creator facet, and the corresponding links return only TMS or library collections, depending on the form clicked.

The problem goes away for TMS creators for whom the LoC heading has been added.
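To make the mismatch concrete, here is a hedged Python sketch of one way the two forms could be collapsed for facet matching. It is not something the deployed index is known to do. The function name `facet_key` is hypothetical, and the sketch assumes the normalization would apply only to the facet value, leaving display strings untouched.

```python
import re

# En dash (U+2013) or em dash (U+2014) sitting between two four-digit years,
# with optional surrounding whitespace.
_DATE_DASH = re.compile(r"(?<=\d{4})\s*[\u2013\u2014]\s*(?=\d{4})")


def facet_key(name: str) -> str:
    """Replace a dash between life dates with a plain hyphen."""
    return _DATE_DASH.sub("-", name)


tms_form = "Romney, George, 1734\u20131802"   # TMS style, en dash
library_form = "Romney, George, 1734-1802"     # library style, hyphen
same_entry = facet_key(tms_form) == facet_key(library_form)
```

The pattern only touches a dash between two four-digit years, so open-ended or approximate dates would pass through unchanged and would need their own handling.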

edgartdata commented 1 year ago

@flapka Thanks for your feedback Francis. The issue you describe will go away as I add more LoC headings in TMS as alternate names.

yulgit1 commented 1 year ago

Also, for ones without loc-naf, the facet will pull from the TMS Display Bio field, which has en dashes. These should become hyphens too.

edgartdata commented 1 year ago

Curatorial wanted en-dashes so I think the only solution is for me to keep adding the LoC names to TMS :(

edgartdata commented 1 year ago

Closing this issue knowing that we deployed the solution in prod earlier this week. CIA will keep entering LoC NAF names in TMS over the next few weeks for all artists that have LoC NAF preferred headings.