interscript / interscript-ruby

Interoperable script conversion systems (ISCS) with the `interscript` gem

Add validation on authority codes, script codes and language codes #368

Closed ronaldtse closed 3 years ago

ronaldtse commented 4 years ago

Script codes must be ISO 15924 codes.

Language codes must be ISO 639-X codes. Each code should be namespaced to the actual standard number.

Filenames must follow the pattern {authority-code}-{language-code}-{source-script-code}-{target-script-code}-{id}.yaml.
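As a rough illustration of the filename rule above, the pattern could be checked with a regular expression. This is only a sketch: `MAP_FILENAME` and `parse_map_filename` are hypothetical names, and the shape assumed for each part (lowercase authority code, 2 or 3 letter language code, 4-letter title-case ISO 15924 script codes, free-form id) is an assumption, not something the gem is known to enforce. It also ignores the namespacing of language codes mentioned above.

```ruby
# Hypothetical sketch: the per-part shapes are assumptions, not the gem's rules.
MAP_FILENAME = /\A
  (?<authority>[a-z]+)-              # authority code, e.g. "bgnpcgn"
  (?<language>[a-z]{2,3})-           # language code, assumed 2 or 3 lowercase letters
  (?<source_script>[A-Z][a-z]{3})-   # ISO 15924 codes: 4 letters, title case
  (?<target_script>[A-Z][a-z]{3})-
  (?<id>.+)                          # free-form id, may itself contain hyphens
  \.yaml\z/x

# Returns a hash of the filename parts, or nil if the name does not match.
def parse_map_filename(filename)
  match = MAP_FILENAME.match(filename)
  return nil unless match

  match.named_captures.transform_keys(&:to_sym)
end
```

For example, `parse_map_filename("bgnpcgn-ukr-Cyrl-Latn-2019.yaml")` (an illustrative name) yields the five parts as a hash, while a name that does not follow the pattern yields `nil`.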

Only the following authority codes are allowed:

acadsin:
  code: acadsin
  name:
    en: Academia Sinica
ahl:
  code: ahl
  name:
    en: The Academy of the Hebrew Language
alalc:
  code: alalc
  name:
    en: American Library Association -- Library of Congress
ammi:
  code: ammi
  name:
    en: Afghanistan Ministry of Mines and Industries
ansi:
  code: ansi
  name:
    en: American National Standards Institute
apcbg:
  code: apcbg
  name:
    en: Antarctic Place-names Commission of Bulgaria
asm:
  code: asm
  name:
    en: Academy of Sciences of Moldova
az:
  code: az
  name:
    en: Azerbaijan Government
bas:
  code: bas
  name:
    en: Bulgarian Academy of Sciences
bds:
  code: bds
  name:
    en: Bulgarian Institute for Standardization
bgn:
  code: bgn
  name:
    en: United States Board on Geographic Names
bgna:
  code: bgna
  name:
    en: National Assembly of the Republic of Bulgaria
bgnpcgn:
  code: bgnpcgn
  name:
    en: United States Board on Geographic Names -- Permanent Committee on Geographical
      Names for British Official Use
bis:
  code: bis
  name:
    en: Bureau of Indian Standards
biulo:
  code: biulo
  name:
    en: Bibliothèque interuniversitaire des langues orientales
bsi:
  code: bsi
  name:
    en: British Standards Institution
bt:
  code: bt
  name:
    en: Royal Government of Bhutan
bulac:
  code: bulac
  name:
    en: Bibliothèque universitaire des langues et civilisations
by:
  code: by
  name:
    en: Government of Belarus
cn:
  code: cn
  name:
    en: Government of China
cnt:
  code: cnt
  name:
    en: Lao Commission Nationale de Toponymie
din:
  code: din
  name:
    en: German Institute for Standardization
dmg:
  code: dmg
  name:
    en: Deutsche Morgenländische Gesellschaft
dos:
  code: dos
  name:
    en: Survey Department, Ministry of Land Management, Cooperatives and Poverty
      Alleviation, Government of Nepal
easc:
  code: easc
  name:
    en: Euro-Asian Council for Standardization, Metrology and Certification
efeo:
  code: efeo
  name:
    en: École française d'Extrême-Orient
elot:
  code: elot
  name:
    en: Hellenic Organization for Standardization
gaz:
  code: gaz
  name:
    en: Azeri Government
ggg:
  code: ggg
  name:
    en: Georgian State Department of Geodesy and Cartography
gki:
  code: gki
  name:
    en: State Committee on Property of the Republic of Belarus
gost:
  code: gost
  name:
    en: Rosstandart
gsi:
  code: gsi
  name:
    en: Geospatial Information Authority of Japan
hk:
  code: hk
  name:
    en: Hong Kong Government
icao:
  code: icao
  name:
    en: International Civil Aviation Organization
ign:
  code: ign
  name:
    en: Institut Geographique Nationale
iso:
  code: iso
  name:
    en: International Organization for Standardization
itk:
  code: itk
  name:
    en: Inuit Tapiriit Kanatami
jp:
  code: jp
  name:
    en: Government of Japan
jra:
  code: jra
  name:
    en: Japan Road Association
kp:
  code: kp
  name:
    en: Democratic People's Republic of Korea
lbmod:
  code: lbmod
  name:
    en: Lebanese Republic Ministry of National Defense
lshk:
  code: lshk
  name:
    en: Linguistic Society of Hong Kong
ma:
  code: ma
  name:
    en: Kingdom of Morocco
md:
  code: md
  name:
    en: Republic of Moldova
mext:
  code: mext
  name:
    en: Ministry of Education, Culture, Sports, Science and Technology -- Japan
mk:
  code: mk
  name:
    en: Republic of North Macedonia
mlc:
  code: mlc
  name:
    en: Myanmar Language Commission
mlit:
  code: mlit
  name:
    en: Ministry of Land, Infrastructure, Transport and Tourism of Japan
mlmupc:
  code: mlmupc
  name:
    en: The Ministry of Land Management, Urban Planning and Construction of Cambodia
moct:
  code: moct
  name:
    en: Korean Ministry of Culture and Tourism
mofa:
  code: mofa
  name:
    en: Ministry of Foreign Affairs of Japan
msst:
  code: msst
  name:
    en: The Major State Service "Turkmenstandartlary"
mv:
  code: mv
  name:
    en: Republic of Maldives
nco:
  code: nco
  name:
    en: National Cartographic Center of Iran
nikl:
  code: nikl
  name:
    en: National Institute of Korean Language
nrs:
  code: nrs
  name:
    en: Nippon-no-Rômazi-Sya
  notes: Also known as the Japan Romanization Society.
odni:
  code: odni
  name:
    en: Office of the Director of National Intelligence
rjgc:
  code: rjgc
  name:
    en: Royal Jordanian Geographic Center
royin:
  code: royin
  name:
    en: The Royal Society of Thailand
  notes: Formerly named The Royal Institute of Thailand (royin)
rs:
  code: rs
  name:
    en: Republic of Serbia
sac:
  code: sac
  name:
    en: Standardization Administration of China
ses:
  code: ses
  name:
    en: Survey of Egypt
sfs:
  code: sfs
  name:
    en: Finnish Standards Association
sgk:
  code: sgk
  name:
    en: Khmere Service Geographique
tm:
  code: tm
  name:
    en: Republic of Turkmenistan
ua:
  code: ua
  name:
    en: Government of Ukraine
ucis:
  code: ucis
  name:
    en: Uyghur Computer Information Society
un:
  code: un
  name:
    en: United Nations
uz:
  code: uz
  name:
    en: Government of Uzbekistan
var:
  code: var
  name:
    en: Various systems managed by ISO {docnumber}/AG
xlsc:
  code: xlsc
  name:
    en: XUAR Language and Script Committee
yivo:
  code: yivo
  name:
    en: YIVO Institute for Jewish Research
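
A validation of authority codes against the list above might look roughly like the following. This is a minimal sketch, not the project's implementation: `AUTHORITY_EXCERPT`, `AUTHORITIES` and `valid_authority?` are hypothetical names, and only a few entries from the list are inlined to keep the example self-contained.

```ruby
require "yaml"

# A short excerpt of the list above, inlined so the sketch runs on its own.
AUTHORITY_EXCERPT = <<~EXCERPT
  acadsin:
    code: acadsin
    name:
      en: Academia Sinica
  bgn:
    code: bgn
    name:
      en: United States Board on Geographic Names
  iso:
    code: iso
    name:
      en: International Organization for Standardization
EXCERPT

AUTHORITIES = YAML.safe_load(AUTHORITY_EXCERPT)

# True when the code is a known key and the entry's own `code` field agrees with it.
def valid_authority?(code, authorities = AUTHORITIES)
  entry = authorities[code]
  !entry.nil? && entry["code"] == code
end
```

Checking that the key and the `code` field agree also catches copy-paste mistakes in the list itself.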
ronaldtse commented 4 years ago

We have to add:

mvd:
  code: mvd
  name:
    en: The Ministry of Internal Affairs of the Republic of Belarus

stategeocadastre:
  code: stategeocadastre
  name:
    en: State Service of Ukraine for Geodesy, Cartography and Cadastre (StateGeoCadastre)
webdev778 commented 4 years ago

I'll add it

webdev778 commented 4 years ago

PR #470

webdev778 commented 4 years ago

I'm wondering whether we should keep YAML files for the authority, language and script codes in the codebase.

ronaldtse commented 4 years ago

For authority codes, let's keep the YAML in this codebase. For script and language codes, we can depend on gems or just keep the YAML here.

webdev778 commented 3 years ago

I think it's better for interscript to maintain these gems, since many interscript repos refer to them.

ronaldtse commented 3 years ago

  • Auth code: Stored the authority code list from #368 as auth_codes.yaml in the codebase.

Can you rename it to authority_codes.yaml? "auth" sounds like authentication.

  • Lang code: Found the iso-639 gem, but it does not support ISO 639-3 codes. An issue for this has been posted, but there seem to be no active maintainers: xwmx/iso-639#6

You can add ISO 639-3 codes to this repo: https://github.com/metanorma/iso-639-codes

There are also codes from ISO 639-5; those need to be added too. I don't foresee these codes being added to that particular gem.

This is the file from Unicode, who manages the ISO 15924 list: iso15924-utf8-20200424.txt

We can have a gem for iso-15924.
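
A starting point for such a gem could be a small parser for the Unicode list. The sketch below assumes the file is semicolon-separated with `#` comment lines and that the first three fields are the code, the numeric code and the English name. `SAMPLE_15924` and `parse_iso15924` are hypothetical names, and the sample lines are illustrative and should be checked against the actual file.

```ruby
# Illustrative lines in the assumed layout: code;number;English name;...
SAMPLE_15924 = <<~TXT
  # Comment lines start with '#'
  Cyrl;220;Cyrillic;cyrillique;Cyrillic;1.1;2004-05-01
  Latn;215;Latin;latin;Latin;1.1;2004-05-01
TXT

# Builds a hash keyed by the 4-letter script code, skipping blanks and comments.
def parse_iso15924(text)
  text.each_line.with_object({}) do |line, scripts|
    line = line.strip
    next if line.empty? || line.start_with?("#")

    code, number, english_name = line.split(";")
    scripts[code] = { number: number, name: english_name }
  end
end
```

A script code validator can then be a simple `key?` lookup on the resulting hash.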

I think it's better for interscript to maintain these gems, since many interscript repos refer to them.

Agree.

webdev778 commented 3 years ago

OK, I'll fix the name. I'll add the ISO 639-3 codes to the repo https://github.com/metanorma/iso-639-codes. I'm wondering whether we should create a gem for ISO 639-X.

ronaldtse commented 3 years ago

Let's make a gem for iso-639-data then.

webdev778 commented 3 years ago

Added 2 new authority codes.

sasm:
  code: sasm
  name:
    en: The former State Administration of Surveying and Mapping of the People's Republic of China
mns:
  code: mns
  name:
    en: Standard of Mongolia

I'm not quite sure what MNS stands for here.

ronaldtse commented 3 years ago

MNS should actually be MASM, Mongolian Agency for Standardization and Metrology.

webdev778 commented 3 years ago

PR #612

ronaldtse commented 3 years ago

Thanks @webdev778, #612 merged.