uwlib-cams / MARC2RDA

mapping between MARC21 and RDA-RDF
Creative Commons Zero v1.0 Universal

344 sound characteristics #153

Closed CECSpecialistI closed 1 month ago

CECSpecialistI commented 2 years ago

https://github.com/uwlib-cams/MARC2RDA/blob/main/Working%20Documents/3XX.csv

pennylenger commented 2 months ago

Hi all. For the 344 $2 mapping, do we need to mint a nomen and then use the element "has scheme of nomen"? For example, for $a, since the element "type of recording" does not have a range, how can I associate it with a nomen? And will $0 and $1 be the identifier or URI for the value in $a? If they are URIs, can they be used directly as values and $2 be discarded? If $0 is an identifier, how can we map the source or schema?

GordonDunsire commented 2 months ago

@pennylenger: We should not use a nomen as the value of an attribute element like field 344.

Instead, use skos:Concept as the type of the minted IRI, use skos:prefLabel for the attribute value, and use skos:inScheme for the source. This is consistent with the approach used for subject field concepts.

We have to assume, I think, that subfield $1 references the concept as a rwo, and subfield $0 is the usual muddle. This is still under general discussion ...

So I would treat subfield $1 as the object value of the appropriate RDA element and ignore the other subfields. If there is no subfield $1 but there is a string value and a subfield $2 source, I would mint a skos:Concept as the object value and add the prefLabel and inScheme statements.
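As a rough illustration of the rule above, here is a minimal sketch in plain Python that emits triples as tuples. The function name `map_attribute`, the `mint_iri` callback, and the `ex:` identifiers are hypothetical and not part of the project's transform; the sketch also passes the raw $2 code as the skos:inScheme value, whereas a real transform would presumably resolve it to a scheme IRI.

```python
from typing import Callable, Dict, List, Tuple

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

Triple = Tuple[str, str, str]


def map_attribute(subject: str, element: str, subfields: Dict[str, str],
                  mint_iri: Callable[[str, str], str]) -> List[Triple]:
    """Map one attribute value following the rule described above.

    `subfields` maps a subfield code ("a", "1", "2", ...) to its value.
    `mint_iri` is a hypothetical helper that mints an IRI for a concept.
    """
    # Subfield $1 wins: use it as the object and ignore everything else.
    if "1" in subfields:
        return [(subject, element, subfields["1"])]

    term = subfields.get("a")
    source = subfields.get("2")
    # No $1, but a string value and a $2 source: mint a skos:Concept.
    if term and source:
        concept = mint_iri(term, source)
        return [
            (subject, element, concept),
            (concept, RDF_TYPE, SKOS + "Concept"),
            (concept, SKOS + "prefLabel", term),
            (concept, SKOS + "inScheme", source),
        ]

    # Other cases are not covered by the rule above; emit nothing here.
    return []


def example_mint(term: str, source: str) -> str:
    # Placeholder minting pattern, for illustration only.
    return "http://example.org/concept/" + source + "/" + term.replace(" ", "_")


triples = map_attribute("ex:manifestation1", "ex:hasTypeOfRecording",
                        {"a": "digital", "2": "rdatr"}, example_mint)
```

With the sample input, `triples` holds four statements: the attribute statement itself plus the type, prefLabel, and inScheme statements for the minted concept.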

pennylenger commented 2 months ago

@GordonDunsire Thank you very much, Gordon. I have another question. For fields such as 336, 337, and 338, where $b is the code for the term in $a, if there is no $1 and $0 is not a URI, should $b and $a be mapped together by minting a skos:Concept as the object value, or separately? Or is there another mapping method for the code? And if rdf:datatype is used, is it more concise than minting a skos:Concept as the object value and adding the prefLabel and inScheme statements?

GordonDunsire commented 2 months ago

@pennylenger: Use skos:notation for the code. To clarify, for fields that use subfield $b for a code for a term from a controlled vocabulary, and:

*** SKOS says that the notation is usually typed. We can do this by using/creating a local XML datatype based on the source given in subfield $2. The sources are coded using the LoC [Genre/Form Schemes](https://id.loc.gov/vocabulary/genreFormSchemes.html). We have discussed this use of the LoC source vocabularies in the context of 6XX fields.

cspayne commented 2 months ago

@GordonDunsire @pennylenger Both $a and $b are repeatable which means we can't know whether $a and $b will refer to the same concept. Will we have to mint separate concepts for each $a and $b?

pennylenger commented 2 months ago

Hi Cypress, I think we need to mint separate concepts for separate occurrences of subfield $a or subfield $b. And terms from different source vocabularies are recorded in separate occurrences of the field. 33X and 34X fields are also repeatable.

pennylenger commented 2 months ago

Which options do we use for instance data that has no value in subfield $1 and $b, but has a value in subfield $a and a value in subfield $2 that is an RDA vocabulary?

GordonDunsire commented 2 months ago

This discussion is becoming confusing; we are talking about 336, 337, and 338 under the topic 344, and the subfields have different meanings.

@cspayne: The MARC 21 manual says to use repeats of subfield $a and $b only when they have the same source. If the sources are different, repeats of the field should be used. For fields where subfield $a is a controlled term and subfield $b is a code for the term, we should expect repeats to be ordered as $a, $b, $a, $b, etc. If this pattern is found, or the patterns $a, $a (with no $b) or $b, $b (with no $a), it is safe to assume that the repeats refer to different controlled terms and codes. If the pattern is $a, $a, $b or $a, $b, $b, then we should ignore the subfield $b and not add a skos:notation to the minted concept.
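The pattern test described above could be sketched roughly as follows. The function name `pair_terms_and_codes` is hypothetical, and treating every other mixed pattern the same way as $a, $a, $b (keep the terms, drop the codes) is an assumption that goes beyond the two patterns named in the comment.

```python
from typing import List, Optional, Tuple

Pair = Tuple[Optional[str], Optional[str]]  # (term from $a, code from $b)


def pair_terms_and_codes(subfields: List[Tuple[str, str]]) -> List[Pair]:
    """Pair $a terms with $b codes based on the order of the subfields.

    `subfields` is the ordered list of (code, value) pairs of one field;
    subfields other than $a and $b are ignored here.
    """
    ab = [(code, value) for code, value in subfields if code in ("a", "b")]
    seq = "".join(code for code, _ in ab)
    values = [value for _, value in ab]

    # $a, $b, $a, $b, ...: each code belongs to the term before it.
    if seq and seq == "ab" * (len(seq) // 2):
        return list(zip(values[0::2], values[1::2]))
    # Only $a repeats: separate concepts without a notation.
    if "b" not in seq:
        return [(value, None) for value in values]
    # Only $b repeats: separate concepts known by their code alone.
    if "a" not in seq:
        return [(None, value) for value in values]
    # Any other mix (e.g. $a, $a, $b): ignore $b, no skos:notation.
    return [(value, None) for code, value in ab if code == "a"]


pairs = pair_terms_and_codes(
    [("a", "performed music"), ("b", "prm"), ("2", "rdacontent")]
)
```

Each resulting pair would then become one minted concept, with skos:notation added only when the second element is not `None`.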

@pennylenger: if the field has a value in subfield $a and an RDA vocabulary source in subfield $2, we can safely mint an IRI for the RDA or cloned-RDA value depending on the source value. However, this will require a lookup of the term in subfield $a because the IRI patterns are based on the code/notation, not the term. If the term is not the preferred RDA label, the lookup will fail and an IRI cannot be created. I do not recommend, at this stage of the project, attempting to develop maps from 'old' terms to new. In any case, I don't think any of the controlled RDA terms have changed since they were added to the pre-3R RDA Toolkit, so any variations are not RDA, irrespective of what subfield $2 says. I don't know if LoC has changed any of the cloned terms since they were published.
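A minimal sketch of that lookup, assuming the preferred labels of the relevant RDA vocabulary have already been loaded into a dictionary. The function name, the `ex:` identifiers, and the table contents (an example.org placeholder IRI) are all hypothetical; exact matching after trimming whitespace is also an assumption.

```python
from typing import Dict, Optional, Tuple

Triple = Tuple[str, str, str]


def rda_value_statement(subject: str, element: str, term: str,
                        pref_label_to_iri: Dict[str, str]) -> Optional[Triple]:
    """Return a statement with the vocabulary IRI as object, or None.

    Only an exact match on the preferred label succeeds; no attempt is
    made to map 'old' or variant terms to current ones.
    """
    iri = pref_label_to_iri.get(term.strip())
    if iri is None:
        # Lookup failed: no IRI can be created for this term.
        return None
    return (subject, element, iri)


# Placeholder table for illustration; real IRIs come from the RDA Registry.
type_of_recording = {"digital": "http://example.org/placeholder/typeRec/digital"}

found = rda_value_statement("ex:m1", "ex:hasTypeOfRecording",
                            "digital", type_of_recording)
missing = rda_value_statement("ex:m1", "ex:hasTypeOfRecording",
                              "digital recording", type_of_recording)
```

Here `missing` is `None` because "digital recording" is not a preferred label in the placeholder table, which mirrors the failure case described above.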

GordonDunsire commented 2 months ago

@pennylenger: Note also that if there are repeats of 33x fields, or of subfields within a 33x field, we are dealing with multiple unit or aggregate manifestations. A single unit manifestation can only be assigned one carrier type or one media type, and a single expression can only be assigned one content type if it is not an amalgamation expression such as a performed song. For 337 and 338, this means we have to process subfield $3 (materials specified). Good luck with that: the example in the MARC 21 manual is "liner notes", which is either a wrapper (considered to be an integral part of the carrier, such as the sleeve of a vinyl record) or an insert (considered to be a sub-unit of the carrier, such as an insert sheet of a vinyl record or a CD booklet). My recommendation is to not transform these fields if a subfield $3 is present.

cspayne commented 2 months ago

Hi @GordonDunsire and @pennylenger! I've created a 3XX discussion page where we can continue to discuss all the nuances of updating the 3XX mappings, since, as Gordon has pointed out, our conversation has expanded beyond field 344.

I've included all of the above discussion as quotes so we can try and keep everything in one place and easy to find!

pennylenger commented 2 months ago

Hi @GordonDunsire @cspayne I drafted a mapping for 344, using $a as an example. Could you please look at it and help revise any parts where my understanding might be wrong or logically inconsistent?

Look for the first $a, first $1, and first $0.

If $1 is present or $0 contains “http”: [Manifestation] has type of recording [value of $1]; ignore all other subfields. (The mapping for $0 when it is an identifier has not been decided.)

If there is no $1 and $0 is not a URI, but $2 is present (and does not contain “rda”): [Manifestation] has type of recording (mintedIRI)

skos:prefLabel <$a>
skos:inScheme <$2>

If there is no $1 and $0 is not a URI, but $2 is present (and contains “rda”): [Manifestation] has type of recording [uri]. We look up the uri from https://www.rdaregistry.info/termList/typeRec/index.html and insert the uri as the value. (Did I understand it correctly? Or do we need to mint a uri according to some rules?)

If there is no $1, no $0, and no $2: [Manifestation] has type of recording [“value of $a”]

Look for the second $a, second $1, and second $0...

Look for the second field 344...

pennylenger commented 2 months ago

And how can we put $3 - Materials specified (NR) into 344 mapping?

GordonDunsire commented 2 months ago

@pennylenger: id.loc.gov has clones of the RDA value vocabularies (see the examples for field 344 in the MARC 21 manual) and these can be treated the same way as the RDA sources. For example, http://id.loc.gov/vocabulary/msoundcontent/sound can be mapped as the equivalent of http://rdaregistry.info/termList/soundCont/1001. You can therefore expand the filter to $2 contains 'rda' or 'id.loc.gov'. Furthermore, we don't need to run a lookup to get the id.loc.gov uri because it is the value of subfield $i appended to the uri domain 'http://id.loc.gov/vocabulary/msoundcontent/'.

We can go further and substitute the RDA uri for the id.loc.gov uri where there is an equivalence, if we want to output 'purer' RDA. I will propose to the RSC Technical Working Group that maps from RDA to id.loc.gov for these cloned vocabularies are added to the RDA Registry.

I assume we are not mapping subfield $j because it is not associated with the manifestation in focus.

If subfield $3 is present, I don't see how we can transform the field, because we don't know if it pertains to a sub-unit or an 'accompanying' manifestation.
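The points above could be combined in a rough sketch like the following, limited to subfield $i (sound content). The function name and the `prefer_rda` flag are hypothetical, the equivalence table holds only the one pair quoted above, and the sketch assumes the $i value can be appended to the domain unchanged.

```python
from typing import Dict, Optional

LOC_SOUND_CONTENT = "http://id.loc.gov/vocabulary/msoundcontent/"

# The only equivalence stated in this discussion; a complete map is
# assumed to come from elsewhere (for example, a future registry map).
LOC_TO_RDA = {
    "http://id.loc.gov/vocabulary/msoundcontent/sound":
        "http://rdaregistry.info/termList/soundCont/1001",
}


def sound_content_iri(subfields: Dict[str, str],
                      prefer_rda: bool = False) -> Optional[str]:
    """Return an IRI for 344 $i, or None if the field is not transformed."""
    # Subfield $3 present: do not transform the field at all.
    if "3" in subfields:
        return None
    term = subfields.get("i")
    source = subfields.get("2", "")
    # Filter on the source as suggested: 'rda' or 'id.loc.gov' in $2.
    if not term or not ("rda" in source or "id.loc.gov" in source):
        return None
    # No lookup needed: append the $i value to the vocabulary domain.
    loc_iri = LOC_SOUND_CONTENT + term
    if prefer_rda:
        # Substitute the RDA IRI where an equivalence is known.
        return LOC_TO_RDA.get(loc_iri, loc_iri)
    return loc_iri


# The $2 value here is illustrative; the filter only needs it to contain 'rda'.
loc_result = sound_content_iri({"i": "sound", "2": "rdasco"})
rda_result = sound_content_iri({"i": "sound", "2": "rdasco"}, prefer_rda=True)
```

Falling back to the id.loc.gov IRI when no equivalence is known keeps the output usable while the maps are incomplete; whether that fallback is wanted is a project decision.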

tmqdeborah commented 2 months ago

@GordonDunsire said: "If subfield $3 is present, I don't see how we can transform the field, because we don't know if it pertains to a sub-unit or an 'accompanying' manifestation."

For the fields that contain manifestation characteristics (340-348) I think that ‘accompanying material’ that is referenced by a subfield $3 is for material that is always a sub-unit of a multiple unit manifestation. This applies whether or not the manifestation is an aggregate manifestation, and, in fact, I think it would be very rare if the manifestation was NOT an aggregate manifestation in this situation.

A file of LC records that contain 344 $3 at: 344$3_LC

Some examples from that file are:

010 $a 00517748
245 00 $a Introspect / $c the Quiet Room.
300 $a 1 audio disc : $b digital ; $c 4 3/4 in.
336 $a performed music $b prm $2 rdacontent $3 audio disc
336 $a text $2 rdacontent $3 lyrics
337 $a audio $2 rdamedia $3 audio disc
337 $a unmediated $2 rdamedia $3 lyrics
338 $a audio disc $2 rdacarrier $3 audio disc
338 $a volume $2 rdacarrier $3 lyrics
340 $b 4 3/4 in. $3 audio disc
344 $a digital $2 rdatr $3 audio disc
344 $b optical $2 rdarm $3 audio disc
347 $a audio file $2 rdaft $3 audio disc
347 $b CD audio $3 audio disc
500 $a Title from disc label.
500 $a Lyrics inserted in container.

It looks like the $3 subfields are added to clarify (unnecessarily?) that the attributes are for the audio disc, not the ‘volume’ of the multiple unit aggregate manifestation.

010 $a 00716016
245 00 $a From the ladle to the grave / $c Boiled in Lead.
300 $a 1 audio disc : $b digital ; $c 4 3/4 in. + $e booklet ([10] pages : illustrations ; 12 cm)
336 $a performed music $b prm $2 rdacontent $3 audio disc
336 $a text $2 rdacontent $3 lyrics
337 $a audio $2 rdamedia $3 audio disc
337 $a unmediated $2 rdamedia $3 lyrics
338 $a audio disc $2 rdacarrier $3 audio disc
338 $a volume $2 rdacarrier $3 lyrics
340 $b 4 3/4 in. $3 audio disc
344 $a digital $2 rdatr $3 audio disc
344 $b optical $2 rdarm $3 audio disc
347 $a audio file $2 rdaft $3 audio disc
344 $c 1.4 m/s $3 audio disc
347 $b CD audio $3 audio disc
500 $a Title from disc label.
500 $a Lyrics in booklet.

010 $a 95781228
245 10 $a Oops! / $c Dan Crow.
300 $a 1 audio disc : $b analog, 33 1/3 rpm ; $c 12 in. + $e 1 folded lyric sheet
336 $a performed music $b prm $2 rdacontent
337 $a audio $b s $2 rdamedia
338 $a audio disc $b sd $2 rdacarrier
340 $a vinyl $2 rdamat
340 $b 12 in.
344 $a analog $2 rdatr
344 $c 33 1/3 rpm
344 $d microgroove $2 rdagw
344 $g stereo $2 rdacpc $3 audio disc
500 $a Title from disc label.

010 $a 99595651
245 10 $a Words and music / $c Samuel Beckett, [words], Morton Feldman, [music].
300 $a 1 audio disc (41 min., 42 sec.) : $b digital, CD audio ; $c 4 3/4 in.
336 $a performed music $b prm $2 rdacontent $3 audio disc
336 $a text $2 rdacontent $3 program notes
337 $a audio $2 rdamedia $3 audio disc
337 $a unmediated $2 rdamedia $3 program notes
338 $a audio disc $2 rdacarrier $3 audio disc
338 $a volume $2 rdacarrier $3 program notes
340 $b 4 3/4 in. $3 audio disc
344 $a digital $2 rdatr $3 audio disc
344 $b optical $2 rdarm $3 audio disc
347 $a audio file $2 rdaft $3 audio disc
347 $b CD audio $3 audio disc
500 $a Radio play, for 2 speakers, 2 flutes, vibraphone, piano, violin, viola, violoncello.
500 $a Title from disc label.
546 $a Spoken in English or undetermined language/s.

245 00 $a Howdjadoo / $c John McCutcheon.
300 $a 1 audio disc : $b analog, 33 1/3 rpm ; $c 12 in.
344 $g stereo $2 rdacpc $3 audio disc
500 $a Title from disc label.
500 $a "Howjadoo coloring book" / by Tina Liza Jones (Blacksburg, VA : Swan & Quill Pub. Co. 17 p. : ill.) including program notes and words of some of the songs, inserted in container.

My research shows that the audio disc might also appear without the coloring book; so, in this case, the single 344 with a $3 is added to say that the attributes are for the audio disc, not the ‘volume’ of the multiple unit aggregate manifestation.

Here are a few examples from a larger file (not retained) looking for multiple 344 without $3:

245 10 $a Road to everafter / $c Dan Oakenhead.
300 $a 1 audio disc : $b digital ; $c 4 3/4 in.
336 $a performed music $b prm $2 rdacontent
337 $a audio $b s $2 rdamedia
338 $a audio disc $b sd $2 rdacarrier
344 $a digital $2 rdatr
344 $b optical $2 rdarm
347 $a audio file $2 rdaft
347 $b CD audio
500 $a Title from disc label.
500 $a All songs written by Dan Oakenhead.
500 $a Lyrics (12 unnumbered pages : illustrations) inserted in container.

245 10 $a ...from a bleeding heart / $c Beseech.
300 $a 1 audio disc : $b digital ; $c 4 3/4 in.
336 $a performed music $b prm $2 rdacontent
337 $a audio $b s $2 rdamedia
338 $a audio disc $b sd $2 rdacarrier
344 $a digital $2 rdatr
344 $b optical $2 rdarm
344 $g stereo $2 rdacpc
347 $a audio file $2 rdaft
347 $b CD audio
500 $a Title from disc label.
500 $a Lyrics on insert.

Many examples had accompanying material values (e.g., 300$e or a 500 accompanying material note) for other physical carriers, without indicating that the 344 applied to the sound recording carrier, perhaps because no other carrier characteristics (34X) fields were recorded.

Looking at these examples, it appears to me that the practice is to:

So, I suggest that (at LC at least) the $3 in a 344 does pertain to a subunit of the manifestation in focus and, therefore, “the information on materials specified can be assigned to the manifestation being described (the superunit)”. But might it be useful to also map the $3 value as either:

I have also checked the 345, 346, and 348 fields but not the 347, because I’m not sure what’s going on with LC’s use of that field. Examples of those fields are in <345_346_348$3_LC.txt>