cldf-clts / clts-legacy

Cross-Linguistic Transcription Systems
Apache License 2.0

acoustic cross-linguistic vowel corpus #117

Closed LinguList closed 4 years ago

LinguList commented 5 years ago

This corpus is difficult to find, but many scholars seem to use it (?).

see for example here: https://www.biorxiv.org/content/biorxiv/early/2018/06/14/346965.full.pdf

and here: https://github.com/soskuthy/u-fronting

If the data is in any way available in IPA, we could/should add it.

thiagochacon commented 5 years ago

The vowel inventories used in the study appear in the appendix, grouped into sets according to formant frequencies and dispersion in the vocal tract. The appendix is attached for reference. The transcriptions are narrow in that they reflect the normalized formant frequencies of each language's vowels in a contrastive cross-linguistic approach. IPA diacritics are used to differentiate vowels, as the author puts it:

"Standard IPA diacritics are used here in narrow transcriptions, denoting mostly raising, lowering, fronting, retraction and mid-centralization (e.g. [ʊ̝ ʊ̞ ʊ̟ ʊ̠ ʊ̽] respectively). These diacritics usually indicate F1 deviations of 30-50Hz and F2 deviations of 100-150Hz relative to the characteristic formant frequencies of IPA vowels. For example, assuming that [ʊ] has the characteristic formant frequencies: F1 = 375Hz, F2 = 1000Hz, then [ʊ̞] denotes a vowel around F1 = 410Hz, F2 = 1000Hz, [ʊ̟] denotes a vowel around F1 = 375Hz, F2 = 1125Hz, and [ʊ̽] denotes a vowel around F1 = 410Hz, F2 = 1125Hz."

On the other hand, the distinction between front rounded and unrounded vowels is not systematically captured in the inventories.

If somebody could extract the data in the appendix, I guess we could use it as a nice illustration of narrow IPA with articulatory diacritics. Becker-Kristal 2010 Appendix.pdf
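
As a small illustration of the diacritic arithmetic in the quoted passage, here is a minimal Python sketch. The step sizes (35 Hz for F1, 125 Hz for F2) are taken from the [ʊ] example in the quote and are only one point within the stated 30-50 Hz and 100-150 Hz ranges. The function name `estimate_formants`, the `SHIFTS` table, and the direction of the mid-centralization shift (which follows the [ʊ] example and would differ for other vowels) are my own assumptions, not anything defined in the appendix.

```python
import unicodedata

# Assumed step sizes, read off the [ʊ] example in the quote.
F1_STEP = 35   # Hz; the quote gives a range of 30-50 Hz
F2_STEP = 125  # Hz; the quote gives a range of 100-150 Hz

# Combining diacritic -> assumed (delta F1, delta F2) in Hz.
SHIFTS = {
    "\u031d": (-F1_STEP, 0),         # up tack below: raised (lower F1)
    "\u031e": (+F1_STEP, 0),         # down tack below: lowered (higher F1)
    "\u031f": (0, +F2_STEP),         # plus sign below: fronted (higher F2)
    "\u0320": (0, -F2_STEP),         # minus sign below: retracted (lower F2)
    "\u033d": (+F1_STEP, +F2_STEP),  # x above: mid-centralized, direction as for [ʊ] only
}

def estimate_formants(grapheme, base):
    """Return an approximate (F1, F2) for a base vowel plus diacritics.

    `base` maps a plain vowel symbol to its characteristic (F1, F2).
    Diacritics missing from SHIFTS raise a KeyError.
    """
    decomposed = unicodedata.normalize("NFD", grapheme)
    vowel, marks = decomposed[0], decomposed[1:]
    f1, f2 = base[vowel]
    for mark in marks:
        d1, d2 = SHIFTS[mark]
        f1 += d1
        f2 += d2
    return f1, f2

# Characteristic values for [ʊ] as assumed in the quoted passage.
BASE = {"ʊ": (375, 1000)}

print(estimate_formants("ʊ\u031e", BASE))  # lowered
print(estimate_formants("ʊ\u031f", BASE))  # fronted
print(estimate_formants("ʊ\u033d", BASE))  # mid-centralized
```

With these assumed steps the three calls reproduce the values given in the quote: (410, 1000), (375, 1125) and (410, 1125).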

tresoldi commented 5 years ago

Raw extraction here (sorry, XLSX because GitHub does not allow most file types here). The graphemes should be reviewed, and the language codes in the second sheet should ideally be linked to Glottolog. becker_kristal.xlsx

LinguList commented 5 years ago

So do we add this for the release of CLTS? If so, who will look into this?

xrotwang commented 4 years ago

See https://github.com/cldf-clts/clts/issues/2