Open pjox opened 5 years ago
Not sure where the problem lies. ‘fre’ is a standard code for modern French. We also have words in Latin, hence ‘latin’. In the etymologies, I have to correct as we now use
G
On 5 Jul 2019 at 12:50, Pedro J. Ortiz notifications@github.com wrote:
Hello,
Looking at the files, I've seen you use xml:lang="fre" instead of xml:lang="fr" or xml:lang="fra". For example in file LettreR_workingFile.xml:
L'Academie écrit Rôle; et c'eſt ainſi qu'on doit écrire, pour marquer que la premiere syllabe eſt longue; ce que l'on marquoit autrefois en écrivant Roolle. Etat, ou liſte de noms de pluſieurs perſonnes qui ſont de même condition, ou dans le même engagement. Dès que le nom d'un ſoldat eſt écrit ſur le rôle, c'eſt pour lui un crime capital de deſerter. Le Comiſſaire à faire les montres tient les rôles, arrête les rôles. On appelle les Ouvriers dans les ateliers trois fois le jour sur le rôle; on les paye ſuivant qu'ils ſont marquez ſur le rôle. Ce mot vient de rutulus ou rotulus, qui ſignifie un rouleau, parce qu'autrefois on rouloit ces rôles, & toutes les ex-
I looked it up https://iso639-3.sil.org/code/fre and this code is normally the ISO 639-2/B way of tagging the French language. The problem is that in the same entry there is also rutulus, where the code lat is used, which is the ISO 639-2 and ISO 639-3 https://iso639-3.sil.org/code/lat way of tagging Latin.
This is just a detail and can be corrected without much effort. However, it would be nice to use the same standard for all languages, preferring the ISO 639-3 https://iso639-3.sil.org/ standard, which uses codes like:
fro: Old French (842-ca. 1400); frm: Middle French (ca. 1400-1600); fra: French; ang: Old English (ca. 450-1100); enm: Middle English (1100-1500); eng: English; grc: Ancient Greek (to 1453); ell: Modern Greek (1453-); lat: Latin; ron: Moldavian, Moldovan, Romanian. The complete list is here https://iso639-3.sil.org/code_tables/639/data.
Making this change standardizes the language codes and makes it easier for me to automatize this later.
Thanks! 😄
The problem is that fre is an ISO 639-2/B code, and most of the Python libraries I use support either ISO 639-1, ISO 639-2/T or ISO 639-3. So I would always have to change fre to fra or to fr. This is not difficult, but it would be convenient if we all "speak" the same standard (preferably ISO 639-3, which is the standard right now).
Also as stated on Wikipedia https://en.wikipedia.org/wiki/ISO_639-2#B_and_T_codes:
B and T codes
While most languages are given one code by the standard, twenty of the languages described have two three-letter codes, a "bibliographic" code (ISO 639-2/B), which is derived from the English name for the language and was a necessary legacy feature, and a "terminological" code (ISO 639-2/T), which is derived from the native name for the language and resembles the language's two-letter code in ISO 639-1. There were originally 22 B codes; scc and scr are now deprecated.
In general the T codes are favored; ISO 639-3 uses ISO 639-2/T.
So the ISO 639-2/T code, which is fra, is compatible with ISO 639-3, which is also fra. The ISO 639-2/B code fre is not compatible with any other standard.
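A minimal sketch of the conversion I keep having to do, assuming the only codes that need rewriting are the twenty ISO 639-2/B codes mentioned in the Wikipedia excerpt. The table and the function name `to_iso639_3` are my own illustration, not taken from any library. Every other three-letter code is passed through unchanged, on the assumption that it is already a valid ISO 639-3 code.

```python
# Hypothetical helper: maps the twenty ISO 639-2/B ("bibliographic") codes
# to their ISO 639-2/T equivalents, which ISO 639-3 reuses.
B_TO_T = {
    "alb": "sqi", "arm": "hye", "baq": "eus", "bur": "mya", "chi": "zho",
    "cze": "ces", "dut": "nld", "fre": "fra", "geo": "kat", "ger": "deu",
    "gre": "ell", "ice": "isl", "mac": "mkd", "mao": "mri", "may": "msa",
    "per": "fas", "rum": "ron", "slo": "slk", "tib": "bod", "wel": "cym",
}


def to_iso639_3(code: str) -> str:
    """Return the T/639-3 form of a three-letter code.

    Codes that are not ISO 639-2/B codes are returned unchanged
    (assumption: they are already ISO 639-3 codes such as 'lat' or 'fro').
    """
    normalized = code.strip().lower()
    return B_TO_T.get(normalized, normalized)


if __name__ == "__main__":
    for tag in ("fre", "lat", "fro", "gre"):
        print(tag, "->", to_iso639_3(tag))
```

Note that this does not touch two-letter ISO 639-1 codes such as fr; those would need a separate table.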
Hi Pedro,
I am worried about standards that change whenever certain people get an itch. Very many projects have been using 639-2. I use a three-character code for all languages so as to be consistent. If the new standard is ‘fra’, thereby removing the choice, then so be it. Let’s go for 639-3, but we’ll have to change things everywhere.
Our GitHub works fine, but I’ll look at organisation when I get time. Now that GitHub has been bought up by the enemy, I am much more careful.
Best wishes
Geoffrey
Not on holiday, but not much online either. I’ll shut down totally when the granddaughters arrive at the end of the month.
Hello,
Looking at the files, I've seen you use xml:lang="fre" instead of xml:lang="fr" or xml:lang="fra". For example in file LettreR_workingFile.xml:
I looked it up and this code is normally the ISO 639-2/B way of tagging the French language. The problem is that in the same entry there is also <foreign xml:lang="lat" rendition="#i">rutulus</foreign>, where the code lat is used, which is the ISO 639-2 and ISO 639-3 way of tagging Latin.
This is just a detail and can be corrected without much effort. However, it would be nice to use the same standard for all languages, preferring the ISO 639-3 standard, which uses codes like:
fro: Old French (842-ca. 1400); frm: Middle French (ca. 1400-1600); fra: French; ang: Old English (ca. 450-1100); enm: Middle English (1100-1500); eng: English; grc: Ancient Greek (to 1453); ell: Modern Greek (1453-); lat: Latin; ron: Moldavian, Moldovan, Romanian.
The complete list is here.
Making this change standardizes the language codes and makes it easier for me to automatize this later.
Thanks! 😄
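To give an idea of what automating this could look like, here is a rough sketch using only the Python standard library. It is not a tested migration script for the real files: the sample entry and its element names (`entry`, `form`, `etym`) are made up for illustration (only the `foreign` element comes from the actual file), the mapping is deliberately partial, and the function name `normalize_lang_attributes` is hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Partial ISO 639-2/B -> ISO 639-3 mapping (illustrative subset only).
B_TO_T = {"fre": "fra", "gre": "ell", "rum": "ron"}


def normalize_lang_attributes(root: ET.Element) -> int:
    """Rewrite xml:lang values that use a 639-2/B code; return the count changed."""
    changed = 0
    for element in root.iter():  # iter() includes the root element itself
        code = element.get(XML_LANG)
        if code in B_TO_T:
            element.set(XML_LANG, B_TO_T[code])
            changed += 1
    return changed


# Made-up sample; the real files will be structured differently.
SAMPLE = (
    '<entry xml:lang="fre"><form>Rôle</form>'
    '<etym>Ce mot vient de '
    '<foreign xml:lang="lat" rendition="#i">rutulus</foreign>'
    '</etym></entry>'
)

root = ET.fromstring(SAMPLE)
changed = normalize_lang_attributes(root)
print(changed, "attribute(s) rewritten")
print(ET.tostring(root, encoding="unicode"))
```

Values that are not in the mapping, such as lat, are left untouched, so the script is safe to run on files that already mix the two conventions.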