ntra00 / marc2bibframe

Convert MARC to BIBFRAME 1.0 - see lcnetdev/marc2bibframe2 for the current release
http://www.loc.gov/bibframe/

Language qualifiers on parallel script fields #162

Open kiegel opened 10 years ago

kiegel commented 10 years ago

The language qualifiers (e.g. @en) seem wrong on fields with non-roman scripts.

For example:

246 31 |a Liang nong zu zhi tong ji nian jian
    31 |a 粮农组织统计年鉴
246 1 |i Arabic title on cover and Arabic t.p.: |a Kitāb al-iḥṣāʼī al-sanawī li-Munaẓẓamat al-aghdhīyah wa-al-zirāʻah
    1 |i Aratic title on cover and Arabic t.p..: |a كتاب الاحصائي السنوي لمنظمة الاغذية والزراعة

becomes:

<http://example.org/99127025400001452title40> a bf:Title ;
    bf:titleValue "Al-Kitāb al-iḥṣāʼī al-sanawī li-munaẓẓamat al-aghdhīyah wa-al-zirāʻat",
        "الكتاب الاحصائي السنوي لمنظمة الاغذية والزراعة"@en-arab .

<http://example.org/99127025400001452title41> a bf:Title ;
    bf:subtitle "Annuaire statistique de la FAO = Anuario estadístico de la FAO = Liang nong zu zhi tong ji nian jian." ;
    bf:titleValue "FAO statistical yearbook ",
        "FAO statistical yearbook = Annuaire statistique de la FAO = Anuario estadístico de la FAO = 粮农组织统计年鉴."@en .

(LCCN 2005234186)

The value in Arabic script is qualified @en-arab. We're not sure how to read this, but it seems to mean "English in Arabic script", which is wrong for an Arabic title. The value containing Chinese is qualified @en, which is accurate for most of the string but not for the Chinese portion.

We note that the 066 can be a clue to the scripts:

066 __ |c (3 |c $1

but this is hard to apply here because the record contains two different non-roman scripts (Arabic and Chinese).
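
One possible way around the two-scripts-in-one-record problem is to look at the characters of each value rather than at the record-level 066. The following is a minimal Python sketch of that idea, not the converter's actual logic. The function name `dominant_script` and the prefix-to-script table are hypothetical, and the table covers only the scripts that appear in this issue.

```python
import unicodedata
from collections import Counter

# Hypothetical mapping from the first word of a Unicode character name
# to an ISO 15924 script code. Only the scripts seen in this issue.
NAME_PREFIX_TO_SCRIPT = {
    "ARABIC": "Arab",
    "CJK": "Hani",
    "CYRILLIC": "Cyrl",
    "HIRAGANA": "Hira",
    "KATAKANA": "Kana",
    "LATIN": "Latn",
}

def dominant_script(text):
    """Return the most frequent script among the letters in text, or None."""
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation and spaces
        prefix = unicodedata.name(ch, "").split(" ")[0]
        script = NAME_PREFIX_TO_SCRIPT.get(prefix)
        if script:
            counts[script] += 1
    return counts.most_common(1)[0][0] if counts else None

print(dominant_script("粮农组织统计年鉴"))
print(dominant_script("كتاب الاحصائي السنوي لمنظمة الاغذية والزراعة"))
```

Because the check runs per value, the Chinese and Arabic titles in the same record would each get their own script. A mixed string such as the "FAO statistical yearbook = ... = 粮农组织统计年鉴." value would still come out as mostly Latin, so it does not solve the mixed-language case.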

kiegel commented 10 years ago

Non-roman script fields for parallel titles get an incorrect qualifier, e.g.

<http://example.org/99110984870001452title22> a bf:Title ;
    bf:subtitle "latyshskie narodnye skazki o zhivotnykh" ;
    bf:titleType "parallel" ;
    bf:titleValue "Zai︠a︡t︠s︡ i ego druzʹi︠a︡ :",
        "Заяц и его друзья латышские народные сказки о животных"@lv-cyrl .

(OCLC # 893875561)

The book is in Latvian, hence the language code lv, and the parallel title is in Russian. The qualifier therefore describes the parallel title as Latvian in Cyrillic script, which it is not.

Unfortunately, the MARC format does not encode the language of parallel titles.
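
Since the language of a parallel title is not recorded, one option is to state only what is known: the script. BCP 47 allows the primary subtag `und` (undetermined) to be combined with a script subtag. The sketch below illustrates this under that assumption. The function `literal_tag` and its `language_known` flag are hypothetical and are not part of marc2bibframe.

```python
def literal_tag(record_lang, script, language_known=True):
    """Build a language tag for a literal.

    record_lang    -- language code of the record as a whole, e.g. "lv"
    script         -- ISO 15924 script code, e.g. "Cyrl", or None
    language_known -- False when the field's own language is not encoded,
                      as with parallel titles
    """
    lang = record_lang if language_known else "und"
    return f"{lang}-{script}" if script else lang

# Parallel title in a Latvian-language record: the language is unknown,
# so only the script is asserted.
print(literal_tag("lv", "Cyrl", language_known=False))
```

This avoids the false claim made by @lv-cyrl, at the cost of not naming Russian.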

kiegel commented 10 years ago

Another case where problems arise is translations, e.g. a Chinese translation of a work by a Japanese author. The non-roman form of the author's name is Japanese, not Chinese, but we get:

<http://example.org/99131426860001452person10> a bf:Person ;
    bf:authorizedAccessPoint "Inō, Kanori, 1867-1925",
        "伊能嘉矩, 1867-1925"@zh-hani ;
    bf:hasAuthority [ a madsrdf:Authority ;
            madsrdf:authoritativeLabel "Inō, Kanori, 1867-1925" ] ;
    bf:label "Inō, Kanori, 1867-1925" .

This can happen for titles as well:

<http://example.org/99131426860001452title6> a bf:Title ;
    bf:titleValue "Inō Kanori no Taiwan tōsa nikki.",
        "伊能嘉矩の臺湾踏柤日記."@zh-hani .

(OCLC # 793950140)
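
For the title above, the text itself gives a hint: it contains the hiragana character の, which does not occur in Chinese. A rough heuristic along those lines is sketched below in Python. The names `has_kana` and `refine_cjk_language` are hypothetical. The heuristic cannot help with a value written only in Han characters, such as the author's name 伊能嘉矩.

```python
import unicodedata

def has_kana(text):
    """True if text contains at least one hiragana or katakana character."""
    for ch in text:
        name = unicodedata.name(ch, "")
        if name.startswith(("HIRAGANA", "KATAKANA")):
            return True
    return False

def refine_cjk_language(record_lang, text):
    """Prefer 'ja' when kana is present; otherwise keep the record language."""
    return "ja" if has_kana(text) else record_lang

print(refine_cjk_language("zh", "伊能嘉矩の臺湾踏柤日記."))
print(refine_cjk_language("zh", "伊能嘉矩, 1867-1925"))
```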

kiegel commented 10 years ago

The qualifier for Chinese varies from record to record: sometimes @zh and other times @zh-hani. This seems inconsistent, and we cannot see what determines the choice.

For example,

<http://example.org/99161784508601452title29> a bf:Title ;
    bf:titleValue "Yu Yingshi wen ji",
        "余英时文集"@zh .

(OCLC # 891156210)

<http://example.org/99131426860001452title34> a bf:Title ;
    bf:titleValue "Taiwan ta cha ri ji",
        "台灣踏查日記"@zh-hani .

(OCLC # 793950140)
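
To find other records with this kind of variation, converter output could be scanned for languages that appear with more than one tag form. The Python sketch below does this with a simple regular expression over Turtle text. It is not a Turtle parser and may mis-match unusual literals. The function name `tag_variants` is hypothetical.

```python
import re
from collections import defaultdict

# Matches the language tag that follows a closing quote, e.g. "..."@zh-hani
LANG_TAG = re.compile(r'"@([A-Za-z]+(?:-[A-Za-z0-9]+)*)')

def tag_variants(turtle_text):
    """Return {primary language: sorted tag forms} for languages
    that occur with more than one tag form."""
    variants = defaultdict(set)
    for tag in LANG_TAG.findall(turtle_text):
        primary = tag.split("-")[0].lower()
        variants[primary].add(tag.lower())
    return {lang: sorted(tags) for lang, tags in variants.items() if len(tags) > 1}

sample = '''
bf:titleValue "Yu Yingshi wen ji", "余英时文集"@zh .
bf:titleValue "Taiwan ta cha ri ji", "台灣踏查日記"@zh-hani .
'''
print(tag_variants(sample))
```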

kiegel commented 10 years ago

A language qualifier is also added to dates, which seems wrong (e.g. "2014-"@zh).

264 _1 |a Guilin Shi : |b Guangxi shi fan da xue chu ban she, |c 2014-
    _1 |a 桂林市 : |b 广西师范大学出版社, |c 2014-

bf:publication [ a bf:Provider ;
            bf:providerDate "2014-",
                "2014-"@zh ;
            bf:providerName [ a bf:Organization ;
                    bf:label "Guangxi shi fan da xue chu ban she",
                        "广西师范大学出版社"@zh ] ;
            bf:providerPlace [ a bf:Place ;
                    bf:label "Guilin Shi ",
                        "桂林市 :"@zh ] ] ;

(OCLC # 891156210)
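
One simple guard against tagging dates would be to attach a language qualifier only when the value contains at least one letter. The Python sketch below illustrates this. The functions `needs_language_tag` and `to_turtle_literal` are hypothetical and do not reflect how the converter builds its output. A date written with words would still be tagged.

```python
def needs_language_tag(value):
    """True if the value contains at least one letter in any script."""
    return any(ch.isalpha() for ch in value)

def to_turtle_literal(value, lang):
    """Serialize value as a Turtle literal, tagging it only if it has letters."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    if needs_language_tag(value):
        return f'"{escaped}"@{lang}'
    return f'"{escaped}"'

print(to_turtle_literal("2014-", "zh"))
print(to_turtle_literal("桂林市 :", "zh"))
```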