ucldc / rikolti

calisphere harvester 2.0
BSD 3-Clause "New" or "Revised" License

`marc.ucb_tind` validation results #1154

Closed: barbarahui closed this issue 9 hours ago

barbarahui commented 1 day ago

mapper: marc.ucb_tind

Issue

creator is mapping from 700 subfield a (name), but not from 700 subfield e (role). See: 27648, 27572, 26693, 150, 7875, 11237

Note that there are often multiple creators listed in each record. Here is what it might look like in a given source record:

<marc:datafield tag="700" ind1="1" ind2=" " >
<marc:subfield code="a" >Perti, Giacomo Antonio, 1661-1756.</marc:subfield>
<marc:subfield code="e" >Composer.</marc:subfield>
</marc:datafield>
<marc:datafield tag="700" ind1="1" ind2=" " >
<marc:subfield code="a" >Silvani, Francesco.</marc:subfield>
<marc:subfield code="e" >Librettist.</marc:subfield>
</marc:datafield>
<marc:datafield tag="700" ind1="1" ind2=" " >
<marc:subfield code="a" >Carpio, Gaspar de Haro, marchese del, 1629-1687.</marc:subfield>
<marc:subfield code="e" >Dedicatee.</marc:subfield>
</marc:datafield>

https://digicoll.lib.berkeley.edu/oai2d?verb=GetRecord&metadataPrefix=marcxml&identifier=oai:digicoll.lib.berkeley.edu:132824

What is desired

Option A: We would like both 700a and 700e to map, in this format:

700-a, 700-e (Name, Role)
700-a, 700-e (Name, Role)
700-a, 700-e (Name, Role)

So with a comma separating Name & Role. Is this formatting option possible? Here's how it looks in the UCB record: https://digicoll.lib.berkeley.edu/record/132824?v=pdf

Option B: Note that previously, the legacy harvester listed Name & Role in an alternating pattern, such that:

Name
Role
Name
Role
Name
Role

Here's how it looks in -stage via the legacy harvester: https://calisphere-stage.cdlib.org/item/657c461945c423f2bce031b401b3fbeb/

If the first option isn't possible, this alternating list would also be fine.
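For illustration, the two output shapes can be compared with a minimal Python sketch. This is not the rikolti mapper code: the function names `name_role_pairs`, `option_a`, and `option_b` are hypothetical, and the sample record assumes the standard MARC21 slim namespace on a wrapping `marc:record` element, which the snippet above does not show.

```python
import xml.etree.ElementTree as ET

# Assumption: the marc: prefix is bound to the standard MARC21 slim namespace.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

SAMPLE = """<marc:record xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:datafield tag="700" ind1="1" ind2=" ">
    <marc:subfield code="a">Perti, Giacomo Antonio, 1661-1756.</marc:subfield>
    <marc:subfield code="e">Composer.</marc:subfield>
  </marc:datafield>
  <marc:datafield tag="700" ind1="1" ind2=" ">
    <marc:subfield code="a">Silvani, Francesco.</marc:subfield>
    <marc:subfield code="e">Librettist.</marc:subfield>
  </marc:datafield>
</marc:record>"""


def name_role_pairs(record_xml, tags=("700", "710")):
    """Return (name, role) tuples for each matching datafield, in record order."""
    root = ET.fromstring(record_xml)
    pairs = []
    for field in root.findall("marc:datafield", MARC_NS):
        if field.get("tag") not in tags:
            continue
        name = field.findtext("marc:subfield[@code='a']", namespaces=MARC_NS)
        role = field.findtext("marc:subfield[@code='e']", namespaces=MARC_NS)
        if name:
            pairs.append((name, role))  # role is None when subfield e is absent
    return pairs


def option_a(pairs):
    """Option A: one 'Name, Role' string per creator."""
    return [f"{name}, {role}" if role else name for name, role in pairs]


def option_b(pairs):
    """Option B: flat alternating list of Name, Role, Name, Role, ..."""
    values = []
    for name, role in pairs:
        values.append(name)
        if role:
            values.append(role)
    return values


pairs = name_role_pairs(SAMPLE)
print(option_a(pairs))
print(option_b(pairs))
```

Note that the source values keep their trailing periods, so Option A would need some punctuation cleanup to read well; the sketch leaves the values untouched.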

barbarahui commented 1 day ago

@christinklez The ucb tind mapper is currently mapping values from 700-a and 710-a. I will add 700-e. Do you also want to add 710-e?

christinklez commented 1 day ago

Hi @barbarahui! Yes, let's also do 710-e.

I just found a record example that uses 700-a and 700-e as well as 710-a and 710-e: https://digicoll.lib.berkeley.edu/oai2d?verb=GetRecord&metadataPrefix=marcxml&identifier=oai:digicoll.lib.berkeley.edu:122329

Thank you!

barbarahui commented 1 day ago

@christinklez Another question: re mapping pairs of name, role -- what do you want to happen if there is an 880 value?

christinklez commented 1 day ago

I think ideally something like:

700-a, 700-e (or 710)
880-a, 880-e
700-a, 700-e
880-a, 880-e

I skimmed through all the validation reports and found only one collection that has 7x0 fields paired with 880s: 27571. These 710s and 880s seem to be pairing and alternating nicely. I didn't see any 710-e's in these records.

Let me know if I can help clarify or track down any other information!

christinklez commented 11 hours ago

Also, I haven't seen any 880-e's used in this way yet (paired with 700-e or 710-e), so I'm unsure if we have a good way of testing/reviewing this. We could optionally come back to this when the scenario comes up in a collection? Let me know what you think makes sense. Thank you!!

barbarahui commented 10 hours ago

@christinklez OK, the potential existence of 880 values makes implementing the pairing of a and e and having everything in the right order pretty gnarly. Would it be OK to just have things appear as they did in the old harvester, i.e.:

Name
Role
Alternate Value (880)
Name
Role
Alternate Value (880)
Name
Role
Alternate Value (880)

If no Role and/or 880 value exists, those will not appear, obviously.

christinklez commented 9 hours ago

@barbarahui I think what you propose makes sense! Especially with all the work that's already gone into pairing the 880 fields.

Let's do as you propose. On the off chance there is an 880 associated with 700/710-e (though I have not seen this yet in a collection), which order should we expect?

Name
Role
Alternate Value Name (880)
Alternate Value Role (880)
(etc.)

-OR-

Name
Alternate Value Name (880)
Role
Alternate Value Role (880)
(etc.)

Just trying to understand the ordering logic. Thanks!

barbarahui commented 9 hours ago

@christinklez Ah, good question! It would be:

Name
Alternate Value Name (880)
Role
Alternate Value Role (880)
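A rough Python sketch of that ordering, for reference only. It is not the deployed mapper code: `creator_values` and the dictionary layout are hypothetical, and it assumes each 700/710 field has already been matched to its linked 880 field (if any) before flattening.

```python
def creator_values(fields):
    """Flatten creator fields into one ordered list of values.

    Assumption (hypothetical layout): each field is a dict with optional
    keys "a" (name), "e" (role), and "880" (a dict holding the linked
    alternate-script "a" and "e" values).
    """
    values = []
    for field in fields:
        alt = field.get("880") or {}
        # For each subfield, emit the value and then its 880 counterpart.
        for code in ("a", "e"):
            for source in (field, alt):
                value = source.get(code)
                if value:  # missing values are simply skipped
                    values.append(value)
    return values


example = [
    {"a": "Name 1", "e": "Role 1", "880": {"a": "Alt Name 1", "e": "Alt Role 1"}},
    {"a": "Name 2", "e": "Role 2"},
    {"a": "Name 3"},
]
print(creator_values(example))
```

Iterating over subfields in the outer loop and over the field and its 880 in the inner loop is what keeps each value next to its alternate form.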

christinklez commented 9 hours ago

That sounds great, thanks so much @barbarahui !!

barbarahui commented 9 hours ago

@christinklez OK, the code is deployed and the e subfield values for 700 and 710 should now be included for creator

christinklez commented 7 hours ago

A quick update that we reviewed the reports and they look great! Thank you so, so much @barbarahui !!