OregonDigital / oregondigital

OregonDigital Hydra Application
https://oregondigital.org/catalog/

[Remediation] building-or preflight check #1510

Closed sseymore closed 2 years ago

sseymore commented 2 years ago

collection_id: building-or

Known Issues

Related Tickets

lsat12357 commented 2 years ago

ok, here's some wacky predicates for this coll:

oregondigital:df67sn00s Predicate not found: http://opaquenamespace.org/ns/sheetmusic/largerWork during crosswalk
oregondigital:df68gn31c Predicate not found: http://opaquenamespace.org/ns/sheetmusic/hostItem during crosswalk
oregondigital:fx71bw165 Predicate not found: http://opaquenamespace.org/ns/sheetmusic/hostItem during crosswalk
oregondigital:fx71c1849 Predicate not found: http://opaquenamespace.org/ns/sheetmusic/largerWork during crosswalk

lsat12357 commented 2 years ago

assuming assets have dc.type, delete these? http://purl.org/dc/elements/1.1/type "img"

sseymore commented 2 years ago

@lsat12357 For the missing identifier fields, please put pna_m000 in that field for remediation in OD2 post-launch.

sseymore commented 2 years ago

assuming assets have dc.type, delete these? http://purl.org/dc/elements/1.1/type "img"

Yes.

oregondigital:df67sn00s Predicate not found: http://opaquenamespace.org/ns/sheetmusic/largerWork during crosswalk
oregondigital:df68gn31c Predicate not found: http://opaquenamespace.org/ns/sheetmusic/hostItem during crosswalk
oregondigital:fx71bw165 Predicate not found: http://opaquenamespace.org/ns/sheetmusic/hostItem during crosswalk
oregondigital:fx71c1849 Predicate not found: http://opaquenamespace.org/ns/sheetmusic/largerWork during crosswalk

These statements can be deleted. Data is in other fields. These are preds for sheet music, so don't know how this happened.
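The deletions agreed on above (the stray sheetmusic predicates and the dce type "img" statements) could be sketched roughly as follows. This is only an illustration: the tuple representation, the function names, and the sample values are assumptions, not the actual OD1 crosswalk code.

```python
# Hypothetical in-memory form: each statement is a
# (subject, predicate, object) tuple of plain strings.
SHEETMUSIC_NS = "http://opaquenamespace.org/ns/sheetmusic/"
DCE_TYPE = "http://purl.org/dc/elements/1.1/type"


def should_delete(statement):
    """Return True for statements the thread agreed to drop."""
    _subject, predicate, obj = statement
    # Any sheetmusic predicate is out of place in this collection.
    if predicate.startswith(SHEETMUSIC_NS):
        return True
    # dce type "img" is redundant once assets carry dc.type.
    return predicate == DCE_TYPE and obj == "img"


def remove_unwanted(statements):
    """Keep every statement that is not flagged for deletion."""
    return [s for s in statements if not should_delete(s)]


# Placeholder object values; only the subject and predicates come from the thread.
statements = [
    ("oregondigital:df67sn00s", SHEETMUSIC_NS + "largerWork", "placeholder value"),
    ("oregondigital:df67sn00s", DCE_TYPE, "img"),
    ("oregondigital:df67sn00s", "http://purl.org/dc/terms/title", "placeholder title"),
]
kept = remove_unwanted(statements)
```

Only the title statement survives in this sample; the other two match the deletion rules.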

sseymore commented 2 years ago

Note about remediation for missing identifier above. For missing resource_type, all are images, please add: http://purl.org/dc/dcmitype/Image.
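A minimal sketch of the two defaults described above, assuming each work is held as a dict of field name to list of values. The dict shape and the function name are hypothetical; the placeholder identifier and the Image URI are the ones given in this thread.

```python
# Hypothetical record shape: field name -> list of values.
PLACEHOLDER_IDENTIFIER = "pna_m000"  # to be remediated in OD2 post-launch
IMAGE_TYPE = "http://purl.org/dc/dcmitype/Image"


def fill_missing(record):
    """Return a copy of the record with both defaults filled in when empty."""
    filled = {field: list(values) for field, values in record.items()}
    if not filled.get("identifier"):
        filled["identifier"] = [PLACEHOLDER_IDENTIFIER]
    if not filled.get("resource_type"):
        filled["resource_type"] = [IMAGE_TYPE]
    return filled


# Sample record with an empty identifier and no resource_type at all.
record = {"title": ["placeholder title"], "identifier": []}
filled = fill_missing(record)
```

Existing values are left alone, so records that already have an identifier or resource_type pass through unchanged.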

lsat12357 commented 2 years ago

The majority of bad predicates are for creator, dce:type, rightsHolder, stylePeriod, and workType. I printed out the raw statements and the values currently in those fields; the file is on Google Drive in the uo_pkg5 folder, "building-or_bad_preds_vals.txt". type is already slated to go away. The rest appear to be legit (there are a handful of creators that are not URIs, though) and don't appear to be already in the field, so if they look ok to you, I can insert them. If you can give me URIs to replace the creator strings, I can insert those too.

sseymore commented 2 years ago

Worktype URI replacements:

http://id.loc.gov/authorities/subjects/sh85078076 with http://vocab.getty.edu/aat/300164014
http://id.loc.gov/authorities/subjects/sh85075320 with http://vocab.getty.edu/aat/300444926
http://id.loc.gov/authorities/subjects/sh85062603 with http://vocab.getty.edu/aat/300005425
http://id.loc.gov/authorities/subjects/sh96004672 with http://vocab.getty.edu/aat/300379282
http://id.loc.gov/authorities/subjects/sh85038731 with http://vocab.getty.edu/aat/300026030

The rest I did by hand. Working on remediating your last comment above.
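The worktype replacements listed above amount to a simple lookup table. A rough sketch, assuming worktype values are available as a list of URI strings (the function name is hypothetical; the URI pairs are copied from this comment):

```python
# LCSH subject URI -> Getty AAT URI, as listed in the comment above.
WORKTYPE_REPLACEMENTS = {
    "http://id.loc.gov/authorities/subjects/sh85078076": "http://vocab.getty.edu/aat/300164014",
    "http://id.loc.gov/authorities/subjects/sh85075320": "http://vocab.getty.edu/aat/300444926",
    "http://id.loc.gov/authorities/subjects/sh85062603": "http://vocab.getty.edu/aat/300005425",
    "http://id.loc.gov/authorities/subjects/sh96004672": "http://vocab.getty.edu/aat/300379282",
    "http://id.loc.gov/authorities/subjects/sh85038731": "http://vocab.getty.edu/aat/300026030",
}


def replace_worktypes(values):
    """Swap any listed LCSH URI for its AAT equivalent; leave other values as-is."""
    return [WORKTYPE_REPLACEMENTS.get(value, value) for value in values]


updated = replace_worktypes([
    "http://id.loc.gov/authorities/subjects/sh85062603",
    "http://vocab.getty.edu/aat/300026030",
])
```

Values that are not in the table, including ones that are already AAT URIs, are returned untouched.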

sseymore commented 2 years ago

@lsat12357 There are 119 building-or works in review on OD1. https://oregondigital.org/reviewer?f%5Bdesc_metadata__institution_label_sim%5D%5B%5D=University+of+Oregon%24http%3A%2F%2Fid.loc.gov%2Fauthorities%2Fnames%2Fn80126183&f%5Bdesc_metadata__set_label_sim%5D%5B%5D=Building+Oregon%24http%3A%2F%2Foregondigital.org%2Fresource%2Foregondigital%3Abuilding-or&per_page=100&q=&search_field=all_fields

Please delete these items. They are duplicate uploads.

sseymore commented 2 years ago

@lsat12357 Ok, I think I got all of the creator strings:

Goodwin, George: http://opaquenamespace.org/ns/creator/GoodwinGeorge
Carroll, William G.: http://opaquenamespace.org/ns/creator/CarrollWilliamG
Sparrow, Alex: http://opaquenamespace.org/ns/creator/SparrowAlex
Peters, William H.: http://opaquenamespace.org/ns/creator/PetersWilliamH
Johnson, Peter: http://opaquenamespace.org/ns/creator/JohnsonPeter
Banister and Banister: http://opaquenamespace.org/ns/creator/BanisterandBanister

Noticing a lot of bad URIs in creator, stylePeriod, and workType. Remove /page/ from the URI so that it has one of these forms:

http://vocab.getty.edu/aat/...
http://vocab.getty.edu/ulan/...
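The /page/ cleanup could look something like the sketch below. It assumes the bad values have the /page/ segment directly after the Getty host name, which is an assumption about the data rather than something confirmed in this thread; the function name is hypothetical.

```python
GETTY_PREFIX = "http://vocab.getty.edu/"
GETTY_PAGE_PREFIX = GETTY_PREFIX + "page/"


def strip_getty_page(uri):
    """Drop the /page/ segment from a Getty URI; return other values unchanged."""
    if uri.startswith(GETTY_PAGE_PREFIX):
        return GETTY_PREFIX + uri[len(GETTY_PAGE_PREFIX):]
    return uri


fixed = strip_getty_page("http://vocab.getty.edu/page/aat/300005425")
# fixed is now "http://vocab.getty.edu/aat/300005425"
```

Matching on the full prefix rather than replacing every "/page/" substring avoids touching non-Getty values that happen to contain that segment.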

lsat12357 commented 2 years ago

these are strings in fields that require URIs:

oregondigital:sf268h46t Invalid URI Rosas, Marion found in crosswalk
oregondigital:df67k516r Invalid URI kerby found in crosswalk
oregondigital:sf268h42q Invalid URI Rosas, Marion found in crosswalk
oregondigital:sf268j35h Invalid URI Clark, Liz found in crosswalk
oregondigital:sf268j91v Invalid URI Delph, Sheldon G. found in crosswalk
oregondigital:df67j917p Invalid URI Craftsman found in crosswalk
oregondigital:df67j917p Invalid URI Colonial Revival found in crosswalk
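For illustration, a check along these lines would flag plain strings sitting in URI-only fields. It is a sketch using Python's standard `urllib.parse`, not the validation the crosswalk itself performs, and the rule it applies (http or https scheme plus a host) is an assumption.

```python
from urllib.parse import urlparse


def looks_like_uri(value):
    """Treat a value as a URI only if it has an http(s) scheme and a host."""
    parsed = urlparse(value.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def find_invalid(values):
    """Return the values that would need a URI replacement."""
    return [value for value in values if not looks_like_uri(value)]


invalid = find_invalid([
    "Rosas, Marion",
    "Craftsman",
    "http://opaquenamespace.org/ns/creator/GoodwinGeorge",
])
```

In this sample the two plain strings are reported and the opaquenamespace URI passes.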

lsat12357 commented 2 years ago

This coll is soooo close to being ready to go. Just need the above strings to be fixed, and I'm not sure what to do with this one:

oregondigital:df66zx714 http://www.rdaregistry.info/Elements/e/#colourContent.ed "color"

There is a colorSpace property. Should the value go there, or can this be deleted?

lharka commented 2 years ago

@lsat12357 I've fixed the above strings.

I don't know enough to give input on the colorSpace property. Curious if @jsimic has thoughts regarding whether or not it can be deleted.

Update re: colorSpace property: I reviewed the map and noticed that it is on the list. Hoping @lsat12357 is able to find what the current OD2 predicate should be.

jsimic commented 2 years ago

@lsat12357 @lharka Colorspace is now part of the file characterization, so it can be dumped from the descriptive metadata.

lharka commented 2 years ago

Oh! Good to know. Thanks, Julia!

lsat12357 commented 2 years ago

Thank you, thank you. I think this can be exported now.