digling / tukano-project

Repository for the Tukano project (discussions and automatic data analyses)
GNU General Public License v3.0

Tukano minor issues #16

Open nataliacp opened 8 years ago

nataliacp commented 8 years ago

Seb and I found the following weird symbols in Tukano. For the meaning "much" on the last entry there is a tone mark on an r. This is the only instance of this combination in the whole spreadsheet, I presume a typo. In the same cell, in the third entry there is a ũû vowel sequence which I found weird, but maybe there is a good reason for it. We didn't do any checks for such sequences, so we have no idea if this is common in the data or not.
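
Checks for such sequences could be automated. The snippet below is only a minimal sketch, not part of the project's tooling: the function name `misplaced_tone_marks`, the vowel inventory, and the choice of acute and circumflex as the tone diacritics are assumptions made for illustration. It decomposes a form and reports any tone mark whose base letter is not a vowel.

```python
import unicodedata

# Assumption: Ramirez's tone diacritics are the combining acute and circumflex.
TONE_MARKS = {"\u0301", "\u0302"}
# Assumption: a rough vowel inventory; adjust to the actual data.
VOWELS = set("aeiouɨʉ")


def misplaced_tone_marks(form):
    """Return base+mark pairs where a tone mark sits on a non-vowel."""
    hits = []
    base = None
    # NFD splits precomposed letters into base letter + combining marks.
    for ch in unicodedata.normalize("NFD", form):
        if unicodedata.combining(ch):
            if ch in TONE_MARKS and base is not None and base.lower() not in VOWELS:
                hits.append(base + ch)
        else:
            base = ch
    return hits
```

Run over every cell of the spreadsheet, a non-empty result would flag forms like the one with the tone mark on an r, while a nasal vowel followed by a toned vowel passes unflagged.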

amaliaskilton commented 8 years ago

From what I know of the orthography in Ramirez, the ũû sequence is licit. @thiagochacon?

thiagochacon commented 8 years ago

Yes it is a licit sequence: phonetically both vowels (i.e. the long u) are nasal. The tone is used orthographically in the second vowel of a word by Ramirez.

The tone marker on an < r > is a typo. The tone should go with the next vowel on the right.

nataliacp commented 8 years ago

ok, I presume this is dealt with with the tone on the r. However, my understanding is that the ũû sequence is within the quasi-phonemic delimiters, so I don't see why the orthographic conventions of the source are relevant. From the description I understand that both vowels are of the same quality and the tone applies to both as a unit. Shouldn't this be represented as ũ̂ũ̂?

thiagochacon commented 8 years ago

We didn't change the way Ramirez represented tones. I can include a description of what < ´ > and < ^ > represent, but I thought it would be simpler to leave the system as Ramirez analyzed it. (I mean his notation is neither IPA nor the letter tones we decided to use with some languages.)

That being said, ũ̂ũ̂ is not a phonetic or phonological or orthographic form in Tukano.

nataliacp commented 8 years ago

I was under the impression that we are using strict IPA for the FUN field. Ultimately, it is not my decision to make, but as you know I am a big proponent of consistency across the dataset. I don't think it is a good idea to have the same symbols meaning different things in different languages.

as for the ũ̂ũ̂ sequence, I intended it to be the quasi-phonemic form according to the conventions we have up to now.

levmichael commented 8 years ago

@thiagochacon, I understand why you decided what you did for the tones, but I think that Natalia raises a good point; it would be most consistent just to have high and low tone marks in the quasi-phonemic representation. Could you give us a quick summary of how Ramirez marks tone?

thiagochacon commented 8 years ago

I understand and am happy to change things on the Tukano data if we decide to.

But notice that I am using Ramirez's tones in a more phonemic sense, not a phonetic one. The way we decided to represent tones phonemically is a little awkward for Tukano, especially for some phonetic and phonemic details.

But if we decide to do the transformation from Ramirez to IPA, the transformations would be:

  1. "^" changes to a " ´ " in the same vowel, followed by "L" after the word
  2. " ´ " changes to a " ` " in the last vowel, followed by a "HL"

This will not be an exact phonological representation of Tukano tones, but a pretty close one given the standards we have in this project.
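
The two rewrite rules above could be sketched in code along the following lines. This is a rough illustration under assumptions, not an agreed conversion: it assumes each word carries at most one kind of Ramirez tone diacritic, that the diacritic is swapped in place on the vowel that already carries it, and that the letters L and HL are appended as plain text at the end of the word. The function name `convert_word` is hypothetical.

```python
import unicodedata

CIRCUMFLEX = "\u0302"  # Ramirez "^"
ACUTE = "\u0301"       # Ramirez acute; also the target of the first rule
GRAVE = "\u0300"       # target of the second rule


def convert_word(word):
    """Apply the two rewrite rules to one word (assumed: one tone kind per word)."""
    decomposed = unicodedata.normalize("NFD", word)
    if CIRCUMFLEX in decomposed:
        # Rule 1: "^" becomes an acute on the same vowel, plus "L" after the word.
        out = decomposed.replace(CIRCUMFLEX, ACUTE) + "L"
    elif ACUTE in decomposed:
        # Rule 2: the acute becomes a grave, plus "HL" after the word.
        out = decomposed.replace(ACUTE, GRAVE) + "HL"
    else:
        # No diacritic (Ramirez's low tone): left unchanged in this sketch.
        out = decomposed
    return unicodedata.normalize("NFC", out)
```

Checking the circumflex first and branching, rather than running two replacements in sequence, keeps an acute produced by the first rule from being rewritten again by the second.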

levmichael commented 8 years ago

I'll probably regret asking :), but what is the way in which this is an inexact representation of the tones?

thiagochacon commented 8 years ago

Let me describe a little bit of Tukano tones and Ramirez's representation: Ramirez analyzes three underlying tones:

  1. "register-tone", represented by ^
  2. "contour tone", represented by  ´
  3. "low tone", with no diacritic

Every root has an underlying tone. A few suffixes have one too. Every word has one underlying tone. In compounds, the tone of the left root dominates over the right root.

the "register-tone" in two mora (1) and three mora words (2) has the following realizations (blank space indicates mora):

(1) H H
(2) H H L

Phonetically, the leftmost H can be lowered to L if the vowel is laryngealized or devoiced/aspirated, thus yielding the following possible melodies:
(1A) H H
(1B) L H
(2A) H H L
(2B) L H L

The "contour tone" has the following realizations in two mora (3) and three mora (4) words

(3) L LH (or L MH)
(4) L L H
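
The realizations in (1)-(4) amount to a small lookup table. As a hedged sketch only (the names `MELODIES` and `surface_melody` are made up here, and the low tone is left out because its realizations are not described above):

```python
# Surface melodies per (underlying tone, number of moras), following (1)-(4).
MELODIES = {
    ("register", 2): ["H", "H"],       # (1)
    ("register", 3): ["H", "H", "L"],  # (2)
    ("contour", 2): ["L", "LH"],       # (3); the L MH variant is not modeled
    ("contour", 3): ["L", "L", "H"],   # (4)
}


def surface_melody(tone, moras, lowered_first=False):
    """Return one tone label per mora.

    lowered_first models the phonetic lowering of the leftmost H of the
    register tone on a laryngealized or devoiced/aspirated vowel.
    """
    melody = list(MELODIES[(tone, moras)])
    if tone == "register" and lowered_first:
        melody[0] = "L"  # yields (1B) or (2B)
    return melody
```

For example, `surface_melody("register", 3, lowered_first=True)` gives the melody in (2B), and `surface_melody("contour", 2)` gives the one in (3).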

So, I said that there are some issues in representing the Tukano system the way we have done in the project. Thinking more carefully, I think we can have a level tone representation of the two moras in the root plus floating tones: L in the "register tone" and H in the "contour tones". The only thing we would not capture is the contour tones as in (3). But then we could simply use IPA ̌ .

Does that sound like a good solution?

If so, then comes the hard part: transforming the data :(

levmichael commented 8 years ago

Thanks for this @thiagochacon. It seems to me that at the quasi-phonemic level, the representations that you provide in (1)-(4) are exactly what we want. Whether they are visually depicted with acute and grave accent marks, superscripted letters, or something else is a separate matter, but I think the level of representation we want is exactly what you have provided here. What are your thoughts?

thiagochacon commented 8 years ago

Indeed, Lev, but as you can see there are different patterns, and that was the reason why I asked my RAs not to change Ramirez's orthographic/phonological representation. If we are to transform it by hand, it will take forever.

levmichael commented 8 years ago

Agreed -- it would be slow. Well, there are a number of competing desiderata that we need to balance here, and it will probably be most efficient to discuss verbally, so let's hold off on further discussion until the Skype meeting.