pulibrary / dspace-development

DSpace infrastructure and development resources for the Princeton University Library.
https://dspace-development.readthedocs.io/en/latest/

Reformat the Alumni Records exports #547

Closed: jrgriffiniii closed this issue 2 years ago

jrgriffiniii commented 2 years ago

The user reported the following issues for the Alumni Records XML exports:

For 2021 we ran into a problem. In some files the data does not have a consistent structure (the number of columns varies within the same file). We can convert the data to CSV and load it only if the structure is consistent. See the attached example: Classics2020.csv illustrates the correct format. Please let me know if you have any questions. Would it be possible to rerun the extracts for 2021 to resolve the issue?
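To locate the affected files, a short check of the kind below could count how many columns each row of a converted CSV contains. This is only a sketch: the function names and the sample data are hypothetical and do not come from the actual 2021 exports.

```python
import csv
import io
from collections import Counter


def column_count_report(csv_text):
    """Map each observed column count to the number of rows that have it."""
    reader = csv.reader(io.StringIO(csv_text))
    # Skip completely empty lines, which csv.reader yields as [].
    return Counter(len(row) for row in reader if row)


def is_consistent(csv_text):
    """A file is consistent when every non-empty row has the same width."""
    return len(column_count_report(csv_text)) <= 1


# Hypothetical samples: the second one has a row that is one column short.
sample_ok = "title,author,classyear\nA,B,2020\nC,D,2020\n"
sample_bad = "title,author,classyear\nA,B,2020\nC,2020\n"

if __name__ == "__main__":
    print(is_consistent(sample_ok), dict(column_count_report(sample_ok)))
    print(is_consistent(sample_bad), dict(column_count_report(sample_bad)))
```

A file whose report contains more than one key is one of the files the user describes as having an inconsistent structure.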

jrgriffiniii commented 2 years ago

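The XML structure seems to be identical in both exports: each item has the same child elements in the same order, although authorid and advisor are empty in the 2020 example.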

2019 Export

<collection>
  <item>
    <title>Suicide in Seneca: Tragic and Stoic Perspectives</title>
    <author>Brill, Rachel</author>
    <authorid>961185466</authorid>
    <advisor>Graziosi, Barbara</advisor>
    <classyear>2019</classyear>
    <department>Classics</department>
    <url>http://arks.princeton.edu/ark:/88435/dsp016108vf082</url>
  </item>
</collection>

2020 Export

<collection>
  <item>
    <title>The Landscapes of Pithos Production at Hellenistic Morgantina</title>
    <author>Thurn, Leina</author>
    <authorid/>
    <advisor/>
    <classyear>2020</classyear>
    <department>Classics</department>
    <url>http://arks.princeton.edu/ark:/88435/dsp019z9032880</url>
  </item>
</collection>
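If the column mismatch comes from empty or missing elements being dropped during conversion (an assumption; this thread does not confirm the cause), one way to guarantee a fixed number of columns is to convert the XML against an explicit column list. The following is a minimal sketch using only the Python standard library. The column list is taken from the two examples above, while the function names are hypothetical and this is not the project's actual export code.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Column order taken from the example exports above.
COLUMNS = ["title", "author", "authorid", "advisor",
           "classyear", "department", "url"]


def items_to_rows(xml_text):
    """Return one fixed-width row per <item>, using "" for empty or missing fields."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.iter("item"):
        # findtext returns "" for an empty element and None for a missing one;
        # both are normalized to an empty string so every row has len(COLUMNS) cells.
        rows.append([(item.findtext(name) or "").strip() for name in COLUMNS])
    return rows


def to_csv(xml_text):
    """Serialize the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(items_to_rows(xml_text))
    return buf.getvalue()


# The 2020 example from this thread, with its empty authorid and advisor.
SAMPLE_2020 = """<collection>
  <item>
    <title>The Landscapes of Pithos Production at Hellenistic Morgantina</title>
    <author>Thurn, Leina</author>
    <authorid/>
    <advisor/>
    <classyear>2020</classyear>
    <department>Classics</department>
    <url>http://arks.princeton.edu/ark:/88435/dsp019z9032880</url>
  </item>
</collection>"""

if __name__ == "__main__":
    print(to_csv(SAMPLE_2020))
```

Because the row is built from the column list rather than from whatever elements happen to be present, an item with empty fields still yields seven cells, and values containing commas (such as the author name) are quoted by the csv module.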