cwrc / islandora-etl

Islandora ETL (Extract / Transform / Load)
GNU General Public License v3.0

Temporal subject handling #2

Open ilovan opened 2 years ago

ilovan commented 2 years ago

A taxonomy reference is required: see /admin/structure/taxonomy/manage/temporal_subjects/overview

Workbench check fails on that field

ilovan commented 2 years ago

https://cwrc.ca/islandora/object/reed%3Aaeb808b5-2c93-4e94-a1f4-52e7d3be016b/datastream/MODS/version/0/view

The temporal subject in this case is a date range, expressed in MODS as:

<subject>
<temporal point="start">1501</temporal>
</subject>
<subject>
<temporal point="end">1600</temporal>
</subject>

In the output of the transform to Workbench format, the range is stored as 1501|1600 in field_temporal_subject.
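For illustration only, here is a minimal Python sketch of the behaviour described above. It is not the repository's XQuery transform. The function name `temporal_subjects`, the sample wrapper element, and the use of `|` as the separator are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
SEPARATOR = "|"  # assumed multi-value separator, matching the output quoted above


def temporal_subjects(mods_xml: str) -> str:
    """Join every non-empty <temporal> value in document order."""
    root = ET.fromstring(mods_xml)
    values = [
        (el.text or "").strip()
        for el in root.iter(f"{{{MODS_NS}}}temporal")
    ]
    return SEPARATOR.join(v for v in values if v)


# Hypothetical minimal record wrapping the two subjects from this comment.
SAMPLE = """<mods xmlns="http://www.loc.gov/mods/v3">
  <subject><temporal point="start">1501</temporal></subject>
  <subject><temporal point="end">1600</temporal></subject>
</mods>"""

print(temporal_subjects(SAMPLE))  # 1501|1600
```

Because the `point` attribute is ignored, the start and end of one range come out as two separate values, which is why the field ends up holding 1501|1600 rather than a single range.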

jefferya commented 2 years ago

A second failure case for temporal subjects is when the subject is a year only (e.g., 1582): Workbench interprets the value as a reference ID instead of a label. Details of the problem: https://github.com/mjordan/islandora_workbench/issues/337#issuecomment-947704730

jefferya commented 2 years ago

The following commit helps incrementally in these cases

Failing cases:

https://github.com/cwrc/islandora-etl/blob/9a8a9e9d1ca84fa4ca83642920bd8526ad4e697b/transform_to_workbench/islandora7_to_workbench_utils.xquery#L529-L549

jefferya commented 2 years ago

@ilovan

A question: what should happen if there is a temporal point="start" but no end?

  <mods:subject>
    <mods:temporal point="start">2000</mods:temporal>
  </mods:subject>
  <mods:subject>
    <mods:temporal point="start">2001</mods:temporal>
  </mods:subject>

Is this the expected output?

2000_workbench_separator_2001

Or should each have an open range, like this: 2000/
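To make the two options concrete, here is a small Python sketch. It is hypothetical and not taken from the transform. The helper names `joined` and `open_ranges` are invented for this example, and `|` stands in for whatever the Workbench separator is configured to be.

```python
def joined(starts, separator="|"):
    """Option 1: emit each start value as-is, separated."""
    return separator.join(starts)


def open_ranges(starts, separator="|"):
    """Option 2: emit each start value as an open-ended range."""
    return separator.join(f"{s}/" for s in starts)


starts = ["2000", "2001"]
print(joined(starts))       # 2000|2001
print(open_ranges(starts))  # 2000/|2001/
```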

jefferya commented 1 year ago

Regarding the year (number only) aspect, my best idea (untested) to date:

  1. extract terms from CSV
  2. use Workbench's "create terms" task to add them: https://mjordan.github.io/islandora_workbench_docs/creating_taxonomy_terms/
  3. in CSV, map problematic temporal subjects (e.g., 2000) to the ID generated in the previous step
  4. ingest via Islandora Workbench
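A rough Python sketch of steps 1 and 3 above, equally untested against a real Workbench run. Steps 2 and 4 are Workbench invocations and are not shown. The helper names, the sample rows, and the term IDs in `term_ids` are all made up for illustration. Real IDs would come from Drupal after the terms are created.

```python
import csv
import io

FIELD = "field_temporal_subject"
SEPARATOR = "|"  # assumed multi-value separator


def numeric_terms(rows):
    """Step 1: collect the temporal subjects that are digits only."""
    found = set()
    for row in rows:
        for value in row.get(FIELD, "").split(SEPARATOR):
            if value.strip().isdigit():
                found.add(value.strip())
    return sorted(found)


def remap(rows, term_ids):
    """Step 3: swap each problematic label for its term ID; keep the rest."""
    remapped = []
    for row in rows:
        values = [v.strip() for v in row.get(FIELD, "").split(SEPARATOR)]
        new_row = dict(row)
        new_row[FIELD] = SEPARATOR.join(term_ids.get(v, v) for v in values)
        remapped.append(new_row)
    return remapped


# Hypothetical Workbench input CSV.
SAMPLE_CSV = "id,field_temporal_subject\n1,1582\n2,1501|1600\n3,Tudor period\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

terms = numeric_terms(rows)  # these would be created in step 2
term_ids = {"1501": "101", "1582": "102", "1600": "103"}  # made-up IDs
remapped = remap(rows, term_ids)
```

Labels that are not purely numeric, such as "Tudor period", pass through unchanged, so only the values Workbench would misread as IDs are rewritten.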