Open rsprouse opened 8 months ago
There are yet other patterns, e.g. `Chiriguano (ava dialect), ISO: gui`. The `name` and `iso_codes` fields should probably be checked and corrected by hand.
The languages that have a third part to the string are languages that require a dialect specification. More generally, I agree with the point that there needs to be a bunch of hand correction done. What would probably be ideal, actually, is to switch over to Glottocodes, and get Harald Hammarström to add dialects for us. @mlapier What do you think?
Yes, I agree. This is exactly why we were thinking of switching them all to Glottocodes.
Myriam Lapierre Assistant Professor Department of Linguistics University of Washington
In the Tupian nasal typology input spreadsheet some values for `Language` contain strings with the language name and ISO 639-3 values, e.g. `Araweté [awt]`, which is parsed as the `name` and `iso_codes` values in the .json data format.

Other languages have a third part of the string. What is the intended meaning of the third part and how should it be parsed? Examples: `Avá-Canoeiro Goiás [avv-gos] (avá-canoeiro goiás)` and `Avá-Canoeiro [avv-tct] (avá-canoeiro tocantins)`.
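For discussion, here is a minimal sketch of how the two- and three-part strings might be split. It is not the project's actual parser. The function name `parse_language`, the regular expression, and the `dialect` key are hypothetical, and it assumes the bracketed part holds a single code and the optional parenthesized part is the dialect specification.

```python
import re

# Hypothetical pattern: "Name [code]" optionally followed by "(dialect)".
LANGUAGE_RE = re.compile(
    r"^\s*(?P<name>.+?)\s*\[(?P<code>[^\]]+)\]\s*(?:\((?P<dialect>[^)]+)\))?\s*$"
)


def parse_language(value):
    """Split a Language cell into name, iso_codes, and an optional dialect.

    Returns None when the string does not follow the assumed pattern,
    so that such rows can be set aside for hand correction.
    """
    match = LANGUAGE_RE.match(value)
    if match is None:
        return None
    return {
        "name": match.group("name"),
        # Assumption: one code per bracket, stored as a one-element list.
        "iso_codes": [match.group("code").strip()],
        # None when there is no third, parenthesized part.
        "dialect": match.group("dialect"),
    }
```

Strings that do not fit either shape, such as `Chiriguano (ava dialect), ISO: gui`, return `None` under this sketch and could be collected into a list for manual review.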