uwlib-cams / MARC2RDA

mapping between MARC21 and RDA-RDF
Creative Commons Zero v1.0 Universal

Areas we need to revisit when we tackle Aggregates #383

Closed CECSpecialistI closed 9 months ago

CECSpecialistI commented 1 year ago

This issue tracks identity management for aggregates. List here any MARC fields or mappings we will need to revisit once we have an approach for how many works, and which ones, we will create for aggregates.

CECSpecialistI commented 1 year ago

MARC 505

CECSpecialistI commented 1 year ago

008 24-27

pan-zhuo commented 1 year ago

380 - Form of Work

CECSpecialistI commented 1 year ago

544 1st indicator 0 - Location of other archival materials note - Associated materials MARC bibliographic Spreadsheet

CECSpecialistI commented 1 year ago

008/29 conference publications

tmqdeborah commented 1 year ago

I have put the PowerPoint from last week's meeting in our Google Drive at: https://docs.google.com/presentation/d/178yNNSwSYSI1KiLmKZr2-AgZG84F6-Bt/edit?usp=drive_link&ouid=109763205474483004874&rtpof=true&sd=true

I have added a few additional comments in the Notes for the slides with the green backgrounds and 3 additional slides at the end:

  1. Unanswered questions
  2. Ideas about identifying aggregate manifestations (AM)
  3. Ideas about quick-and-dirty transformations

Deborah

CECSpecialistI commented 1 year ago

Sita sent along a copy of Damian Iseminger's slides from a talk he gave about aggregates at the PCC OpCo meeting in 2019. The file was too big for GitHub, so I put the slides in Google Drive here. Thank you for passing these along, @SitaKB!

tmqdeborah commented 11 months ago

With many restarts, I am slowly working my way through a copy of every 10th record of approx. 5 million records from the LC database (as of Jan 2022) and sorting the records into three categories:

  1. Aggregates: as far as I can tell, the pattern match logic (reasonably) identifies descriptions of aggregates that are primarily one of the following:
     a. Collection aggregates (might also be augmentation and/or parallel)
     b. Parallel aggregates (might also be augmentation)
     c. Augmentation aggregates of single expressions
  2. Possible aggregates: the pattern match logic needs further analysis
  3. Not aggregates: as far as I can tell, the pattern match logic (reasonably) identifies descriptions of single expressions without supplementary content.
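The sampling step that feeds these three categories (taking every 10th record) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `every_nth` is hypothetical, and plain integers stand in for MARC records so the example runs without any MARC library. It does not reproduce how the LC copy was actually produced.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def every_nth(records: Iterable[T], n: int = 10, start: int = 0) -> Iterator[T]:
    """Lazily yield every n-th item of `records`, beginning at index `start`.

    `every_nth` is a hypothetical helper name; it works on any iterable,
    so a stream of parsed MARC records could be passed in the same way.
    """
    return islice(records, start, None, n)


# Integers stand in for records here (an assumption made to keep this runnable).
sample = list(every_nth(range(100), n=10))
print(len(sample), sample[:3])
```

Because `islice` is lazy, the full file never has to be held in memory; only the sampled records are kept.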

To do this processing, I am using MARC Report software (now available at no charge and downloadable here).

Using this software, I am running pattern matches that produce:

  1. a text file (.txt) displaying all MARC fields of records that matched a pattern (AND, OR, NOT, NONE, or NOWHERE) or series of patterns
  2. a MARC file (.mrc) of the records that matched the pattern
  3. a MARC file of the records that did not match the pattern; this file was then used for the next pattern match in my sequence of pulls.
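The sequence of pulls described above, where each pattern splits the input into matched and unmatched records and the unmatched records feed the next pattern, can be sketched as follows. This is not MARC Report's own logic or syntax: records are simplified to dictionaries mapping a MARC tag to a list of field values, and the two example patterns (`has_505`, `has_240`) are hypothetical placeholders rather than the patterns listed in the spreadsheet.

```python
from typing import Callable, Dict, List, Tuple

# Simplified stand-in for a MARC record: tag -> list of field values (assumption).
Record = Dict[str, List[str]]
Pattern = Callable[[Record], bool]


def partition(records: List[Record], pattern: Pattern) -> Tuple[List[Record], List[Record]]:
    """Split records into (matched, unmatched), like the two output MARC files."""
    matched: List[Record] = []
    unmatched: List[Record] = []
    for rec in records:
        (matched if pattern(rec) else unmatched).append(rec)
    return matched, unmatched


def run_cascade(
    records: List[Record], patterns: List[Tuple[str, Pattern]]
) -> Tuple[Dict[str, List[Record]], List[Record]]:
    """Apply patterns in order; each pattern only sees what earlier ones left over."""
    results: Dict[str, List[Record]] = {}
    remaining = list(records)
    for name, pattern in patterns:
        results[name], remaining = partition(remaining, pattern)
    return results, remaining


# Hypothetical example records and patterns, for illustration only.
r1 = {"245": ["Collected poems"], "505": ["Poem A -- Poem B"]}
r2 = {"245": ["Works"], "240": ["Works. Selections"]}
r3 = {"245": ["A novel"]}
r4 = {"240": ["Works. Selections"], "505": ["Part 1 -- Part 2"]}

patterns = [
    ("has_505", lambda rec: "505" in rec),
    ("has_240", lambda rec: "240" in rec),
]
results, remaining = run_cascade([r1, r2, r3, r4], patterns)
```

One consequence of this design is that the order of the patterns matters: a record is claimed by the first pattern it matches (here `r4` lands under `has_505` even though it also has a 240), so later counts depend on what earlier pulls removed.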

I am still working on how to organize and present my findings to you, but in the meantime, I have put some sample result files into an Aggregates folder in the shared folder at: https://drive.google.com/drive/folders/15eAMtPfuEozpKp5RHpHKKwOSLLw9qN2O

In that folder, you will also find:

  • FindingAggregates.20231023.xlsx: a spreadsheet that outlines the pattern matches I have used (so far), and some of the others that I plan to use.
  • Lists of terms for pattern matching:

  1. LCGFT: LCGFT Terms that might identify aggregates—chosen from https://www.loc.gov/aba/publications/FreeLCGFT/GENRE.pdf

  2. Non Music CCT: Non Music Conventional Collective Titles that identify aggregates—chosen from terms found in 130 or 240 fields in records in the LC database from Jan 2022

  3. Music CCT: Music Conventional Collective Titles that identify aggregates—chosen from terms found in 130 or 240 fields in records in the LC database from Jan 2022

Please add appropriate terms that you think might help to identify aggregates. Add a comment if you question any of my choices.
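As a rough illustration of how a term list like these might be applied to 130 and 240 fields, here is a minimal sketch. It is an assumption-laden example, not the matching that MARC Report performs: the helper names `compile_terms` and `matching_titles` are hypothetical, records are simplified to tag-to-values dictionaries, and the sample terms ("Works", "Selections") are placeholders rather than entries copied from the shared lists.

```python
import re
from typing import Dict, Iterable, List, Pattern, Sequence

# Simplified stand-in for a MARC record: tag -> list of field values (assumption).
Record = Dict[str, List[str]]


def compile_terms(terms: Iterable[str]) -> Pattern[str]:
    """Build one case-insensitive, whole-word regex from a list of terms."""
    # Longest terms first so multi-word terms win over their shorter prefixes.
    ordered = sorted(set(terms), key=len, reverse=True)
    if not ordered:
        raise ValueError("term list is empty")
    alternatives = "|".join(re.escape(term) for term in ordered)
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


def matching_titles(
    record: Record, term_re: Pattern[str], tags: Sequence[str] = ("130", "240")
) -> List[str]:
    """Return the 130/240 values in a record that contain any listed term."""
    return [
        value
        for tag in tags
        for value in record.get(tag, [])
        if term_re.search(value)
    ]


# Placeholder terms and records, for illustration only.
term_re = compile_terms(["Works", "Selections"])
hit = matching_titles({"240": ["Works. Selections"], "245": ["Selected works"]}, term_re)
miss = matching_titles({"130": ["Beowulf"]}, term_re)
```

Whole-word matching is a deliberate choice in this sketch: it keeps a term such as "Works" from matching inside an unrelated title word like "Networks", and only the 130 and 240 fields are inspected, so the 245 in the first record is ignored.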

I am sharing this material to give you an idea of what I hope to present in a much more coherent form, to assist us in discussing how we can identify appropriate MARC records and transform them into descriptions of aggregate manifestations and the expressions/works that they embody.