Open CECSpecialistI opened 2 years ago
- See a more detailed discussion below -- ZP 3/22/2022
1. Several subfields record languages of part of an expression, like $b for summary and $f for table of contents, and RDA does not distinguish between primary and secondary contents of an expression. Options: a. Use unstructured descriptions. b. Record languages as Nomens, and record the content types in Nomen: note on nomen, the values for the notes will be taken from MARC subfield labels.
-Unstructured descriptions, plus less-specific "language of expression" triples, plus "loss" in a status would be my approach here. -CEC 2022-03-16
2. Order of subfields in MARC. $a has an instruction that says "for works in multiple languages, the codes for the languages of the text are recorded in the order of their predominance in the text." RDA does not seem to care about ordering at all. I wonder if MARC order should/could be retained in RDA.
-We could either throw in rdf:List with ordered values, or record values without regard to order and add "loss" to status and ask RSC how they envision this being expressed in RDA. -CEC 2022-03-16
3. Subfields for languages of the original. These belong to related expressions so I assume they should be left out of the mapping. But then in cases where 041 1# $a eng $h ger and 546 ## Translated from the German. are both mapped, the information in the unstructured description and identifier description will not match. And the unstructured description is not really about the languages of the expression being described and seems more suitable as a note on expression.
-Yeah, $h is tough because it might be the representative expression but it also very well might not. If there is a 7xx with relationship designator "translation of", that would give us the entity to which $h refers. But when that doesn't exist explicitly, we don't know exactly what entity the $h refers to. It would be good to have a property for "translated from expression in language" or some such thing. This is another one that's just more specific in MARC21 than RDA/RDF and should maybe be brought to RSC's attention. -CEC 2022-03-16
-"Translated from the German" does seem like a note on the expression rather than about the language of expression, but since it's in a note field designated for notes on language, that seems like a problem with the note itself rather than a mapping problem. @AdamSchiff are notes like "546## $aTranslated from the German." correct in MARC? Should we adjust the mapping for them? -CEC 2022-03-16
4. Obsolete data/subfields. Historically there was "the practice of placing multiple language codes in one subfield," and "an item having text in English, translated from French, subfield $a was coded 'engfre'." Should there be conditions for these situations? What were the recording instructions at the time?
-If these values still exist in OCLC, we would benefit from writing conditions for them. I have no idea what the instructions were or what combinations of values might exist. Maybe @AdamSchiff or @CatalogerCate knows?-CEC 2022-03-16
5. Work: language of representative expression. I'm not sure how we can determine a representative expression without human intervention. LC/PCC PS says "Do not record the element." Do we factor LC/PCC PS in the mapping?
We definitely factor in implementation/practice...if there is no information about representative expressions in the data that can be extracted/mapped, then we can't really map it to rdaw:P10353. If conditions can't be isolated to differentiate representative expression from others, I'd record "delete" for any mappings to this property. -CEC 2022-03-16
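If legacy values of the kind described in item 4 do turn up in source data, a condition could first split the concatenated string into individual codes. The following is a minimal sketch only, assuming every code is exactly three ASCII letters; the function name `split_legacy_codes` is hypothetical, and the sketch does not decide which resulting code is the language of the text and which is the language of the original.

```python
from typing import List


def split_legacy_codes(value: str) -> List[str]:
    """Split a legacy concatenated 041 value such as 'engfre' into 3-letter codes."""
    value = value.strip().lower()
    # Assumption: a valid legacy value is a non-empty run of ASCII letters
    # whose length is a multiple of three.
    if not value or len(value) % 3 != 0 or not (value.isascii() and value.isalpha()):
        raise ValueError(f"not a run of 3-letter codes: {value!r}")
    return [value[i:i + 3] for i in range(0, len(value), 3)]
```

A value that already holds a single code passes through unchanged as a one-element list, so the same function could be applied to every incoming subfield value.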
One comment below:
Adam L. Schiff Principal Cataloger University of Washington Libraries (206) 543-8409 @.***
"Translated from the German." is not correct in a 546 field. It should be in a 500. --ALS
I used the expert search k546 "translated from the german" in the LC catalog and got over 1,000 hits. What should we do with mistakes in MARC?
I don't think we can do much. We could report it to LC and tell them they should change the tag, but my understanding is their system cannot do batch edits and they'd have to fix records one-by-one. I doubt they would care to.
For records that we own, we could possibly create a list of OCLC record numbers and do a batch edit to replace the records in OCLC.
Adam
I don't think we need to worry about correcting legacy data, but also don't think we need to accommodate incorrect MARC in the mapping. Eh?
6.1 Parallel aggregates (manifestations that embody two or more expressions of a single work). This example is taken from https://lccn.loc.gov/2016047962.
041 1_ |a eng |a baq |h eng
546 __ |a A bilingual edition with parallel text in English and Basque.
700 12 |a Shakespeare, William, |d 1564-1616. |t Macbeth.
700 12 |a Shakespeare, William, |d 1564-1616. |t Macbeth. |l Basque.
LC-PCC PS says not to identify the aggregating expression of a parallel aggregate (95.99.14.22), so I think we should relate the manifestation separately to the English expression and the Basque expression. My question is: how do we match the codes in 041 with their corresponding expressions? Or do we skip the mapping of 041 in this case and use the structured description in 7XX $l instead? 546 is also problematic. It does not describe the language of any of the expressions being aggregated. Nor does it describe the languages of the aggregating expression, because 'an aggregating expression does not incorporate or accumulate the expressions that are aggregated' (88.69.69.51). It seems that it could be either a note on expression (39.16.08.94) or a note on manifestation (73.81.62.04). Wouldn't the former mean that we need to identify the aggregating expression, which is against LC-PCC PS? (But then again, the PS at 39.16.08.94 says 'cataloger's judgment for augmenting or parallel content,' so I'm confused.)
6.2 Augmentation aggregates (manifestations that embody two or more expressions of two or more works, where one work is supplemented by one or more other works). Many of the 041 subfields describe the languages of supplementary contents. LC-PCC PS treats augmentation aggregates similarly to parallel aggregates, so only the codes in $a could be mapped to the languages of the primary expression. If we want to map the subfields for supplementary contents, it seems that we need to establish them as expressions. (And also create the corresponding works, as required by the minimum description of an expression?) 546 has the same problems as stated in 6.1. But in the case of augmentation aggregates, another possibility I see is to apply the option 09.52.01.48 and map 546 to Manifestation: supplementary content as unstructured descriptions.
6.3 Collection aggregates (manifestations that embody two or more expressions of two or more independent works). LC-PCC PS: It is not necessary to identify expressions in anthologies of poetry, hymnals, conference proceedings, journals, collections of interviews or letters, and similar resources. I would assume these collections are more likely to be multilingual than others? I don't know how 041 can be mapped here.
-If these values still exist in OCLC, we would benefit from writing conditions for them. I have no idea what the instructions were or what combinations of values might exist. Maybe @AdamSchiff or @CatalogerCate knows? -CEC 2022-03-16
There should not be any of these. OCLC cleaned them all up years ago.
Yeah, $h is tough because it might be the representative expression but it also very well might not.
Language of representative expression is not coded in 041. New field 387 $h will hold it: https://www.loc.gov/marc/mac/2022/2022-04.html
Adam
The policy statements surrounding aggregates are very complex, and taken outside of the context given by the guidance documentation are totally unintelligible. The RDA Toolkit guidance on identifying expressions of aggregating works and expressions that are aggregated clarifies how to implement policy statements a bit. Continuing with your example,
041 1_ |a eng |a baq |h eng
546 __ |a A bilingual edition with parallel text in English and Basque.
700 12 |a Shakespeare, William, |d 1564-1616. |t Macbeth.
700 12 |a Shakespeare, William, |d 1564-1616. |t Macbeth. |l Basque.
The policy statement next to the option to "Relate the expression that is aggregated to the expression of the aggregating work using Expression: aggregated by" says "Cataloger's judgment. Use the relationship when relating an analytic description to the expression of an aggregating work." This covers the information in the 700 fields in the example. The option that says "Record information about the expression of the aggregating work using Expression: note on expression" is (I think) where the information in the 041 belongs. I could be totally off-base but I think I'm reading the guidance correctly.
I think the LRM/RDA/RDF created according to policy statements would look like this:
[AggregatingWork] has note on work "A bilingual edition with parallel text in English and Basque."
[AggregatingExpression] has work expressed [AggregatingWork]
[AggregatingExpression] has note on expression "A bilingual edition with parallel text in English and Basque."
[AggregatingExpression] has note on expression "Language of text/sound track or separate title: English."
[AggregatingExpression] has note on expression "Language of text/sound track or separate title: Basque."
[AggregatingExpression] has note on expression "Language of original: English."
[AggregatedExpression1] has authorized access point for expression [Nomen1]
[Nomen1] has nomen string "Shakespeare, William, 1564-1616. Macbeth"
[AggregatedExpression1] aggregated by [AggregatingExpression]
[AggregatedExpression2] has authorized access point for expression [Nomen2]
[Nomen2] has nomen string "Shakespeare, William, 1564-1616. Macbeth. Basque"
[AggregatedExpression2] aggregated by [AggregatingExpression]
A couple of questions:
Is it necessary to differentiate [AggregatingExpression] and [AggregatedExpression] in the transformation notes?
Do we need to make decisions on mapping one type of recording method in MARC to another type in RDA? Like in the case of 041 from identifiers to unstructured descriptions?
Should we match the language code with each expression aggregated?
[AggregatedExpression1] has authorized access point for expression [Nomen1]
[Nomen1] has nomen string "Shakespeare, William, 1564-1616. Macbeth"
[AggregatedExpression1] aggregated by [AggregatingExpression]
[AggregatedExpression1] language of expression "eng"
Are there Work/Expression authorities for aggregating Works/Expressions? Does any of the following count?
These are really good questions. I think, if we're talking about aggregated expressions, we need to differentiate between the aggregating and aggregated expressions in order for our RDA description to work.
Not sure whether we need to formalize decisions on converting codes to unstructured descriptions. For switching from one recording method to another, all we really need to do is record the recording method we want to use in the spreadsheet and ignore the incoming recording method from the MARC. But...in my opinion...it would make sense to map each language code value to a corresponding URI from id.loc, since the codes are included in the linked data descriptions for MARC languages (example: http://id.loc.gov/vocabulary/languages/baq ) and they all follow the pattern http://id.loc.gov/vocabulary/languages/[languageCodeHere] . I think what we would need to do is to create a corresponding expression for each language code, if that makes sense.
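The URI pattern mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `marc_language_uri` is hypothetical, and the check covers only the shape of the code (three letters), not whether the code actually exists in the MARC language list.

```python
import re

# Base of the pattern given above: http://id.loc.gov/vocabulary/languages/[languageCodeHere]
LOC_LANGUAGE_BASE = "http://id.loc.gov/vocabulary/languages/"


def marc_language_uri(code: str) -> str:
    """Build an id.loc.gov language URI from a three-letter MARC language code."""
    code = code.strip().lower()
    # Shape check only; a lookup against the real code list would be a separate step.
    if not re.fullmatch(r"[a-z]{3}", code):
        raise ValueError(f"not a three-letter language code: {code!r}")
    return LOC_LANGUAGE_BASE + code
```

For the Basque code from the example record, `marc_language_uri("baq")` yields the URI cited above.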
Sometimes there are authorities for aggregating Expressions, but not always. It's not required for catalogers to create them, but some do, so the data is really inconsistent. The ones that exist should be controlled in the OCLC records. If we can export MARC with URIs from the controlled headings pulled into the appropriate $0/$1 subfields, it would be easy to know when to pick them out and when they need to be minted.
I believe OCLC have taken care of all the 041 languages that didn’t have subfielding. If you wanted to know for sure Jay Weitz would probably be able to tell you, but I’m pretty sure those were all cleared up years ago. Can’t remember the last time I saw one, and I look at lots of old bibs!
Cate
Cate Gerhart, Librarian Head, Monographic Cataloging Unit University of Washington Libraries @.*** 206 685-2827 (cell: 206 372-5046)
@GordonDunsire Hi Gordon, could you please give some advice on mapping of MARC 041?
For 041 (Language Code), after many discussions, all subfields are mapped to language of expression with the recording method being identifier, which has led to a significant loss of meaning. For example:
1. For subfield $a, for works in multiple languages, the codes for the languages of the text are recorded in the order of their predominance in the text. How can we map this order of predominance?
2. When a work is a translation or includes a translation, the code for the language of the translation is recorded in subfield $a. The code(s) for the language(s) of the original work are recorded in subfield $h. Should we mint another expression or use a note on expression to retain this information? The same problem exists for accompanying material.
3. The first indicator contains important information:
4. For subfields $2 (Source of code) and $3 (Materials specified), should we use a note on expression or mint a nomen?
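For the first question, the rdf:List idea raised earlier in this thread could look roughly like the sketch below, which emits a Turtle collection (an rdf:List) so that the order of the $a codes survives. This is a sketch under assumptions, not a proposed mapping: the predicate `ex:languagesInOrderOfPredominance` is a made-up placeholder (RDA has no such element as far as this thread establishes), the `ex:` prefix would still need to be declared, and the expression IRI is invented for illustration.

```python
from typing import List

LOC_LANGUAGE_BASE = "http://id.loc.gov/vocabulary/languages/"


def ordered_languages_turtle(
    expression_iri: str,
    codes: List[str],
    predicate: str = "ex:languagesInOrderOfPredominance",  # hypothetical placeholder
) -> str:
    """Return one Turtle statement whose object is an ordered collection of language URIs."""
    if not codes:
        raise ValueError("at least one language code is required")
    # Turtle's ( ... ) syntax is shorthand for an rdf:List, which preserves order.
    members = " ".join(f"<{LOC_LANGUAGE_BASE}{code}>" for code in codes)
    return f"<{expression_iri}> {predicate} ( {members} ) ."


# Hypothetical expression IRI; the codes are the $a values in MARC order.
print(ordered_languages_turtle("http://example.org/expression/1", ["eng", "baq"]))
```

The alternative discussed in the thread, recording unordered values and flagging "loss" in the status, needs no extra structure at all, which is the trade-off to weigh.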
@pennylenger: We can (currently) only process this field for a manifestation that is not an aggregate; that is, where there is only one embodied expression. We have not identified a safe means of mapping a language to the correct aggregated expression, and an aggregating expression has no language.
If the manifestation being described in the record embodies only one expression, I think many of the subfields do not apply. I think it is better to use the value of 008/35-37 as the language of expression, so we should only process 041 if there is no 008 value.
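The precedence rule just described (prefer 008/35-37, and fall back to 041 only when 008 has no value) could be sketched as follows. The function name and the definition of "no value" are assumptions: here only blanks and the fill characters `|||` count as empty, while codes such as `mul` or `und` are treated as real values.

```python
from typing import List, Optional

# Assumption: fill characters mean that no attempt was made to code the language.
NO_VALUE = {"|||"}


def language_of_expression(field_008: Optional[str], codes_041a: List[str]) -> List[str]:
    """Prefer the language in 008/35-37; use 041 $a codes only if 008 has no value."""
    # Slicing a short or missing 008 simply yields an empty string.
    code = (field_008 or "")[35:38].strip().lower()
    if code and code not in NO_VALUE:
        return [code]
    return list(codes_041a)
```

Whether this rule should also apply when 008 and 041 $a disagree is not settled in this thread; the sketch simply lets 008 win.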
Answers:
*** The RSC Technical Working Group is likely to add object properties for attribute elements real soon now, and certainly before the transform is published. I recommend that the transform assumes this will happen, and uses object properties for this and similar attributes (e.g. 336, 337, 338).
@GordonDunsire Hi Gordon, thank you very much. I am not sure which subfields do not apply, and I am struggling with mapping the following subfields. Could you please take a look at them and see whether they are mappable at our current stage?
$b - Language code of summary or abstract, and $f - Language code of table of contents: they are part of the expression being described.
$d - Language code of sung or spoken text: these codes are for the audible portion of the item, usually the sung or spoken content of a sound recording or computer file. They seem important, but they can also describe an audible portion of an item that does not represent the primary content.
$e - Language code of librettos: these are codes for the printed text of the vocal/textual content of the work.
$g - Language code of accompanying material other than librettos and transcripts: these are clearly codes for accompanying material.
$h - Language code of original: these refer to a related expression.
$i - Language code of intertitles
$j - Language code of subtitles
$k - Language code of intermediate translations
$m - Language code of original accompanying materials other than librettos: like $g, accompanying materials.
$n - Language code of original libretto: these refer to a related expression.
$p - Language code of captions: not sure whether they are accompanying material.
$q - Language code of accessible audio: not sure whether they are accompanying material.
$r - Language code of accessible visual language (non-textual): not sure whether they are accompanying material.
$t - Language code of accompanying transcripts for audiovisual materials: these are for accompanying transcripts.
These subfields' labels carry important information. Can we keep them as note on expression?
@pennylenger: I don't think it is possible to determine which expression this data is associated with. In most cases the subfield definition indicates separate expressions embodied in an aggregate, but in some cases the language is associated with a component of the expression (for example, the words to a song printed above the notated music; RDA has not yet fully modelled this). I think the safest approach is to transform the data as note on manifestation. The note would have to mention 'language encodings' unless we want to look up the full form on the fly (not recommended).
Hi @pennylenger, I am reviewing 041 and have a question about the mappings for $d (language code of sung or spoken text). The two mappings that carry the condition "Leader byte 6 = j (musical sound recording)" map to Expression properties and use URIs for the language codes if LC or ISO is identified as the source, in addition to a note on expression. The third mapping, without the condition, is only to note on manifestation. I don't understand why we can assume the mapping is at the expression level for a musical sound recording, but use the manifestation level for any other material type. The note under "problem with mapping" says "The audible portion of an item does not represent the primary content". Not sure why that matters.
Other than that, the mapping looks ok to me after reading the discussion here. I wish we could do more with some of these language codes, particularly language of subtitles or intertitles, which are so important to moving image files, and language of the original or intermediate language for translations. For the latter, maybe we can hope that a more user-understandable note also exists in the record, or even a 765 link to the original. But I think this would need to be a "phase 2" consideration. Most library discovery systems don't do a good job with these codes anyway...
Hi Laura, thank you very much for your review. My thought was that if it is a musical sound recording, $d (language code of sung or spoken text) represents the primary content and can be mapped to the language of expression. However, in cases where the audible portion of the item does not represent the primary content, $d is mapped only to a note on manifestation, like many other subfields, because it is not possible to determine which specific expression the data corresponds to, and it cannot be assumed to be the (primary) expression.
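The condition described here can be written down as a small decision function. It is a sketch only: the target labels `"language of expression"` and `"note on manifestation"` are plain strings standing in for whatever the mapping spreadsheet actually records, and the function name is hypothetical.

```python
from typing import Tuple


def map_041_d(leader: str, code: str) -> Tuple[str, str]:
    """Choose the target for an 041 $d code based on Leader byte 6 (type of record)."""
    # Leader/06 = 'j' is a musical sound recording: the sung or spoken text is
    # taken to be the primary content, so the code describes the expression.
    if len(leader) > 6 and leader[6] == "j":
        return ("language of expression", code)
    # Otherwise the expression cannot be identified, so keep the code in a note.
    return ("note on manifestation", f"Language code of sung or spoken text: {code}")
```

With a leader such as `00000cjm a2200000 a 4500` the code goes to the expression; with a book leader (`a` in byte 6) it ends up in the manifestation note.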
Thanks Penny, I understand the reasoning now. I have almost finished reviewing, but want to ask, since we are going ahead and capturing the language codes in notes, why not add "source of code: " followed by the source based on indicator 2 (https://www.loc.gov/marc/languages/ or the source code in $2) to the note? This might get to what Gordon mentioned as "language encodings" - in that we don't want to look up the full form on the fly, but if someone else was so inclined, they would have the code source to do so.
Yes, I can add this like: "source of code: " https://www.loc.gov/marc/languages/ or the source code in $2.
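A note of the kind agreed on here might be assembled as in the sketch below. The exact wording and punctuation of the note are assumptions, as are the function name and the abbreviated label table (only three of the subfield labels listed earlier in the thread are included). Second indicator 7 is taken to mean that the source is named in $2, with the MARC language code list used otherwise.

```python
from typing import Optional

MARC_LANGUAGES_SOURCE = "https://www.loc.gov/marc/languages/"

# Partial table for illustration; labels are the MARC subfield labels quoted in this thread.
SUBFIELD_LABELS = {
    "b": "Language code of summary or abstract",
    "h": "Language code of original",
    "j": "Language code of subtitles",
}


def language_note(subfield: str, code: str, ind2: str = " ",
                  source_2: Optional[str] = None) -> str:
    """Build a note that keeps the subfield label, the code, and the source of the code."""
    label = SUBFIELD_LABELS[subfield]
    # Use $2 only when the second indicator says the source is specified there.
    source = source_2 if ind2 == "7" and source_2 else MARC_LANGUAGES_SOURCE
    return f"{label}: {code} (source of code: {source})"
```

Keeping the source in the note means the code is never resolved on the fly, but anyone who wants the full language name later still knows which list to consult.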
https://github.com/uwlib-cams/MARC2RDA/blob/main/Working%20Documents/0XX.csv