pulibrary / aspace_helpers

methods and reports to support common SC activities in ArchivesSpace

load collection-level records to Alma in bulk #2

Closed regineheberlein closed 2 years ago

regineheberlein commented 3 years ago

- export from ASpace API
- validate/transform/convert
- load to Voyager

regineheberlein commented 3 years ago

- delete invalid 049
- add 001
- add 003

regineheberlein commented 3 years ago

Feedback from Mark:

- change 040 to ‡a NjP ‡b eng ‡e dacs ‡c NjP
- delete 852
- add empty $a to 544

regineheberlein commented 3 years ago

Further feedback from Mark: create 046 from 008
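
A minimal sketch of how the 046 could be derived from the 008, assuming the usual 008 layout (position 06 is the date type, 07-10 is date 1, 11-14 is date 2) and the $a/$c/$e mapping seen in the sample record in this thread. The method name `subfields_046_from_008` is hypothetical and not part of aspace_helpers.

```ruby
# Sketch only: returns [code, value] pairs for a 046 field.
# Assumption: 008/06 -> 046$a, 008/07-10 -> 046$c, 008/11-14 -> 046$e.
def subfields_046_from_008(value_008)
  date_type = value_008[6]
  date1 = value_008[7, 4].to_s.strip
  date2 = value_008[11, 4].to_s.strip

  subfields = []
  subfields << ['a', date_type] unless date_type.nil? || date_type.strip.empty?
  # Only carry over dates that are four plain digits
  subfields << ['c', date1] if date1.match?(/\A\d{4}\z/)
  subfields << ['e', date2] if date2.match?(/\A\d{4}\z/)
  subfields
end

pairs = subfields_046_from_008('220222i20132014xx                  eng d')
pairs.each { |code, value| puts "046 $#{code} #{value}" }
```

Dates that are not four digits (blank or containing `u`) are skipped here; whether they should be carried over instead is an open question.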

regineheberlein commented 3 years ago

Feedback from Faith: This can wait until after the Alma migration

regineheberlein commented 2 years ago

See https://github.com/pulibrary/pulfalight/issues/540

regineheberlein commented 2 years ago

Our MARC endpoint returns invalid quasi-MARC. This seems to be a problem with our setup, as the 3.1.1 sandbox behaves ok. Will need to follow up.

<hash>
 <collection>
  <record>
   <leader>00000npcaa2200000 u 4500</leader>
   <controlfield>
    <__content__>220126i19222016xx         eng d</__content__>
    <tag>008</tag>
   </controlfield>
   <datafield type="array">
    <datafield>
     <subfield type="array">
      <subfield>
       <__content__>eng</__content__>
       <code>b</code>
      </subfield>
      <subfield>
       <__content__>Finding aid content adheres to that prescribed by Describing Archives: A Content Standard.</__content__>
       <code>e</code>
      </subfield>
     </subfield>
     <ind1> </ind1>
     <ind2> </ind2>
     <tag>040</tag>
    </datafield>

regineheberlein commented 2 years ago

fixed:

marc_record = @client.get("/repositories/4/resources/marc21/2065.xml")
marc_document = Nokogiri::XML(marc_record.body)
puts marc_document.to_xml

regineheberlein commented 2 years ago

Great cheat sheet: https://gist.github.com/carolineartz/10276637

regineheberlein commented 2 years ago

treating this as an epic; creating individual tickets

regineheberlein commented 2 years ago

current output for a single record:

<collection xmlns="http://www.loc.gov/MARC21/slim" xmlns:marc="http://www.loc.gov/MARC21/slim" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
  <record>
    <leader>00000npcaa2200000 u 4500</leader>
    <controlfield tag="001">AC424</controlfield>
    <controlfield tag="003">PULFA</controlfield>
    <controlfield tag="008">220222i20132014xx                  eng d</controlfield>
    <datafield ind1=" " ind2=" " tag="040">
      <subfield code="a">NjP</subfield>
      <subfield code="b">eng</subfield>
      <subfield code="e">dacs</subfield>
      <subfield code="c">NjP</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="041">
      <subfield code="a">eng</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="046">
      <subfield code="a">i</subfield>
      <subfield code="c">2013</subfield>
      <subfield code="e">2014</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="099">
      <subfield code="a">AC424</subfield>
    </datafield>
    <datafield ind1="1" ind2=" " tag="100">
      <subfield code="a">Klionsky, Abigail,</subfield>
      <subfield code="e">Creator.</subfield>
      <subfield code="4">cre</subfield>
    </datafield>
    <datafield ind1="1" ind2="0" tag="245">
      <subfield code="a">Abigail Klionsky Oral History Collection on Jewish Student Life at Princeton,</subfield>
      <subfield code="f">1979-2014</subfield>
      <subfield code="g">2013-2014</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="300">
      <subfield code="a">32</subfield>
      <subfield code="f">items</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="351">
      <subfield code="a">Materials are arranged alphabetically by interviewee last name.</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="500">
      <subfield code="a">Location of resource: mudd.</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="500">
      <subfield code="a">Physical Characteristics / Technical Requirements: This collection consists of PDF files. Researchers are responsible for meeting the technical requirements needed to access these materials, including any and all hardware and software.</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="500">
      <subfield code="a">Processing Information: This collection was processed by Rossy Mendez in 2015. Finding aid written by Rossy Mendez in 2015. University Archives staff edited the creator's original transcripts for clarity.</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="506">
      <subfield code="a">The collection is open for research use. Original audio files exist but are restricted.</subfield>
    </datafield>
    <datafield ind1="3" ind2=" " tag="520">
      <subfield code="a">Abigail Klionsky is a member of the Princeton University undergraduate Class of 2014 who undertook an oral history project on Jewish student life at Princeton as part of her senior thesis. The collection consists of fifteen transcripts of Klionsky’s interviews with Jewish alumni and also includes a copy of a transcript of Henry Morgenthau III’s interview with David Frisch in 1979.</subfield>
    </datafield>
    <datafield ind1="2" ind2=" " tag="520">
      <subfield code="a">The collection consists of fifteen transcripts of Klionsky’s interviews with Jewish alumni and also includes a copy of a transcript of Henry Morgenthau III’s interview with David Frisch in 1979. The interviews address several aspects of Jewish life both within and outside of Princeton, including Jewish upbringing, attendance to Jewish services-- particularly during the chapel requirement and high holidays-- colloquiums, the bicker process, and the demographic of the various eating clubs. Interviews with more recent alumni address the beginnings of kosher dining and the development of the Hillel, which later became the Center for Jewish Life. Other topics include the interactions with administrators, faculty and other affiliated individuals such as President Harold W. Dodds, Dean Christian Gauss, Rabbi Irving Levey and Albert Einstein. Lastly, the interviews include details of post-graduation involvement with Jewish life.</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="524">
      <subfield code="a">Abigail Klionsky Oral History Collection on Jewish Student Life at Princeton, Name of Digital File, Princeton University Archives, Special Collections, Princeton University Library.</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="540">
      <subfield code="a">Single photocopies may be made for research purposes. For quotations that are fair use as defined under  U. S. Copyright Law , no permission to cite or publish is required. The Trustees of Princeton University hold copyright to all materials generated by Princeton University employees in the course of their work. If copyright is held by Princeton University, researchers will not need to obtain permission, complete any forms, or receive a letter to move forward with non-commercial use of materials from the Mudd Library. For materials where the copyright is not held by the University, researchers are responsible for determining who may hold the copyright and obtaining approval from them. If you have a question about who owns the copyright for an item, you may request clarification by contacting us through the  Ask Us! form .</subfield>
    </datafield>
    <datafield ind1="1" ind2=" " tag="541">
      <subfield code="a">The collection was transferred to the University Archives in March of 2014.</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="544">
      <subfield code="a"/>
      <subfield code="d">Test.</subfield>
    </datafield>
    <datafield ind1="1" ind2=" " tag="583">
      <subfield code="a">No materials were separated from the collection at the time of accessioning.</subfield>
    </datafield>
    <datafield ind1=" " ind2="0" tag="650">
      <subfield code="a">Universities and colleges -- Alumni and alumnae -- New Jersey -- Princeton.</subfield>
    </datafield>
    <datafield ind1=" " ind2="0" tag="650">
      <subfield code="a">Minorities -- Education (Higher) -- United States.</subfield>
    </datafield>
    <datafield ind1=" " ind2="0" tag="650">
      <subfield code="a">Jewish students –New Jersey.</subfield>
    </datafield>
    <datafield ind1=" " ind2="7" tag="655">
      <subfield code="a">Oral histories.</subfield>
      <subfield code="2">aat</subfield>
    </datafield>
    <datafield ind1=" " ind2="7" tag="655">
      <subfield code="a">Born digital.</subfield>
      <subfield code="2">aat</subfield>
    </datafield>
    <datafield ind1="4" ind2="2" tag="856">
      <subfield code="z">Finding aid online:</subfield>
      <subfield code="u">http://arks.princeton.edu/ark:/88435/4t64gq81d</subfield>
    </datafield>
  </record>
</collection>

regineheberlein commented 2 years ago

lib-sftp

regineheberlein commented 2 years ago

combine the 001 and 003 values into 035$a, formatted as "(PULFA)call_no", to allow an 035 match (we don't need the 001 and 003 themselves)
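
A small sketch of that formatting, assuming the call number comes from the current 001 and the organization code from the current 003. `format_035a` is a hypothetical helper name.

```ruby
# Sketch only: build the 035$a value as "(ORG)call_no".
# Assumption: org_code defaults to the 003 value used in this thread.
def format_035a(call_no, org_code = 'PULFA')
  "(#{org_code})#{call_no.strip}"
end

puts format_035a('AC424')
```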

regineheberlein commented 2 years ago

Get the value of the 500 that has the location information (e.g. "Location of resource: mudd.") into any 982$c
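
One possible way to pull the location out of the 500 notes, assuming the note always starts with "Location of resource:" as in the sample record. That the 982$c should hold the bare code without the prefix or trailing period is also an assumption. `location_from_500s` is a hypothetical name.

```ruby
# Sketch only: find the location note among 500$a values and return the bare code.
LOCATION_PREFIX = /\ALocation of resource:\s*/i

def location_from_500s(notes)
  note = notes.find { |n| n.match?(LOCATION_PREFIX) }
  return nil if note.nil?

  # Drop the label and the trailing period, e.g. "mudd." -> "mudd"
  note.sub(LOCATION_PREFIX, '').strip.chomp('.')
end

notes = [
  'Location of resource: mudd.',
  'Processing Information: This collection was processed by Rossy Mendez in 2015.'
]
puts location_from_500s(notes)
```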

regineheberlein commented 2 years ago

loan policy for ph is closed (no barcode)

regineheberlein commented 2 years ago

for subject headings:

require 'marc'

# `heading` is the subject string; `field` is the MARC::DataField being built
segments = heading.split('--').map(&:strip)
field.append(MARC::Subfield.new('a', segments[0]))

# subdivisions starting with two digits become $y, all others $x
segments[1..-1].each do |segment|
  code = segment =~ /^[0-9]{2}/ ? 'y' : 'x'
  field.append(MARC::Subfield.new(code, segment))
end

# add a terminal period unless the heading already ends in ?, - or .
last = field.subfields[-1]
last.value << '.' unless ['?', '-', '.'].include?(last.value[-1])
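
To check the splitting logic without a MARC record at hand, the same rules can be sketched with plain arrays. This is an illustration only: `subject_subfields` is a hypothetical name, it returns [code, value] pairs instead of MARC::Subfield objects, and it assumes a non-empty heading.

```ruby
# Sketch only: split a subject heading on "--" into [code, value] pairs.
def subject_subfields(heading)
  segments = heading.split('--').map(&:strip)
  pairs = [['a', segments[0]]]

  # Subdivisions starting with two digits are coded $y, everything else $x
  segments.drop(1).each do |segment|
    code = segment.match?(/\A[0-9]{2}/) ? 'y' : 'x'
    pairs << [code, segment]
  end

  # Terminal period unless the heading already ends in ?, - or .
  last = pairs[-1][1]
  pairs[-1][1] = "#{last}." unless ['?', '-', '.'].include?(last[-1])
  pairs
end

subject_subfields('Minorities -- Education (Higher) -- United States.').each do |code, value|
  puts "$#{code} #{value}"
end
```

Note that with these rules geographic subdivisions such as "United States" also end up in $x, since only the two-digit test distinguishes subdivision types.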

regineheberlein commented 2 years ago

After QA, Lynn wants some fields removed from the records; will provide a list.