cwrc / ontology

CWRC ontology - primary repository

Review missed instances in biographic data #448

Open alliyya opened 5 years ago

alliyya commented 5 years ago

Ex. Cultural forms, occupations, education

alliyya commented 5 years ago

Spreadsheet of unmatched cultural forms and occupations (JOB/SIGNIFICANTACTIVITY): Biography Missed Instances - July 2019

alliyya commented 2 years ago

An updated sheet can be found in Biography Missed Instances - 2021

Review of the 2019 sheet is needed to see if all required tasks were completed. Once fully reviewed, I'll re-run extraction for 2021 and update the missed instances accordingly.

SusanBrown commented 2 years ago

Who needs to review it and how @alliyya ?

alliyya commented 2 years ago

I assigned it to Jasmine and Hannah. They can go through the comments of the Unmatched Cultural Forms sheet and double-check whether the alt labels have been added to the ontology or the Orlando entry has been appropriately updated. Then they can consult with you on whether there are terms that still need to be created. For occupations, see the 2021 spreadsheet for terms that don't currently map to the ontology.

SusanBrown commented 2 years ago

Great. In the case of missing occupation terms, do we need to consider adding a term or do we just continue to skip them, in effect?

Will this have to be done every time we extract the data?

alliyya commented 2 years ago

For the missing occupations, we do need to consider adding a term if an appropriate URI is not available. Otherwise, in the CWRC extraction they will be left as strings, and in the CIDOC-CWRC extraction they will be labels for placeholder URIs.

Every time the data is updated and extraction is rerun, this process will need to be repeated for any terms it comes across that don't appear in the ontology.

Ex. VW's file has an additional or tag added and the reg value/free form value doesn't map to any preexisting term. Once extraction is run on all the files, a log is printed of all the attributes that weren't able to be mapped to the ontology so that the data could be further cleaned.
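
A minimal sketch of the fallback and logging behaviour described above, not the actual extraction code. The label index, the `example.org` URIs, and the function names are all hypothetical; it only assumes that a value either matches a known label or alt label, or is kept as a string (CWRC-style) or given a placeholder URI (CIDOC-CWRC-style), with every miss counted for the log.

```python
from collections import Counter

# Hypothetical stand-in for the ontology's label index: preferred and
# alternative labels (lower-cased) mapped to term URIs. Not real CWRC data.
LABEL_TO_URI = {
    "novelist": "http://example.org/occupation#novelist",
    "writer of novels": "http://example.org/occupation#novelist",  # alt label
    "teacher": "http://example.org/occupation#teacher",
}

# Assumed base for placeholder URIs, for illustration only.
PLACEHOLDER_BASE = "http://example.org/placeholder/"


def map_terms(values, label_to_uri, use_placeholders=False):
    """Map free-form values to URIs and count the ones that do not match.

    Unmatched values are kept as plain strings, or turned into placeholder
    URIs when use_placeholders is True.
    """
    mapped = []
    missed = Counter()
    for value in values:
        cleaned = value.strip()
        key = cleaned.lower()
        uri = label_to_uri.get(key)
        if uri is not None:
            mapped.append(uri)
            continue
        missed[cleaned] += 1
        if use_placeholders:
            mapped.append(PLACEHOLDER_BASE + key.replace(" ", "_"))
        else:
            mapped.append(cleaned)
    return mapped, missed


def format_missed_log(missed):
    """Return one tab-separated 'term<TAB>count' line per unmatched term."""
    return [f"{term}\t{count}" for term, count in sorted(missed.items())]
```

The lines returned by `format_missed_log` are the kind of list that could be pasted into a missed-instances sheet for review; adding an alt label to the index is enough to make a previously missed value match on the next run.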

JasmineDW commented 2 years ago

Note to self: any modifications to the CWRC Ontology will then need to be added to the SKOS vocabulary version

  1. Add missing instances
  2. Add alt-labels (Each can be done as a separate version of the SKOS vocabulary to move things along more quickly)