digling / tukano-project

Repository for the Tukano project (discussions and automatic data analyses)
GNU General Public License v3.0

floating tones #26

Open nataliacp opened 8 years ago

nataliacp commented 8 years ago

In order to find the best way to represent floating tones, I think it is important that we all understand the phenomenon. So, I did some basic reading on sandhi phenomena and on floating tones and here is what I understood. Disclaimer: I may not be using the most appropriate terminology sometimes, I am still learning!

Sandhi: From what I understand, these phenomena are about tone interaction, and they don't require an extra entity to be invoked. So, always (or almost always), a given tone before or after another given tone is pronounced as a new, different tone. This reminds me of phonological rules, but involving tones, not segments.

Floating tones: Here we have an entity (a tone without a segment) that is associated with a particular morpheme and interacts with neighboring morphemes by altering their surface tone. In many cases, it was explicit that historically there was a segment associated with this floating tone which was subsequently lost. In this sense, it reminds me of nasalized vowels, where the nasality is left over after the nasal coda consonant disappears.

So, yes, the two phenomena are about tone interaction, but there is a key difference: the presence or not of a historical entity. Crucially a floating tone is associated arbitrarily with a particular root, it is not a rule of tonal interaction that applies throughout the language.

If this previous description is accurate enough, I think we need to track floating tones as entities in our alignment. In some languages we may find that they correspond to no floating tones, or to a segment. So, I think that we need a notation that

  1. gives them their own column in the alignment
  2. makes it clear which root they are associated with and where they are manifested (before or after)
  3. differentiates them from "regular" tones

amaliaskilton commented 8 years ago

@LinguList : I didn't realize we could use different representations for lingpy and reflex. If that's true, I'd suggest we circumvent this issue by not using tones for automatic alignment at all. From my lexical knowledge of the 740 list, there is no language in which we have a significant number of minimal tone pairs/sets in the data - the maximum number of minimal sets that will appear in the list is probably less than 10 per language. And when I have made cognate sets by hand, it has rarely been crucial to consult the tones at all in order to make a cognate judgment (the only cases where I have needed to do that have involved the glottal stop to low tone change in Mai). (but @thiagochacon should jump in if he thinks tones are crucial to cognate judgments in ET!) For those reasons I tend to doubt that including/excluding either floating tones or all tones will make much of a difference to our results from automatic alignment.

On Friday, March 4, 2016, thiagochacon notifications@github.com wrote:

I see, and now I understand where you are coming from. However, I guess we need to separate the ideal linguistic representation from the capabilities of Reflex and LingPy to handle these representations. If Reflex can structure the data closer to the way we want it represented, and we then just need to adapt things so that LingPy can do its job better while still keeping the original data structure when going back to Reflex, I think this would be preferred.

Otherwise we will need a more practical solution, which would make it uglier for linguists, but prettier for the computers... : )

Date: Fri, 4 Mar 2016 06:29:59 -0800 From: notifications@github.com To: tukano-project@noreply.github.com CC: thiago_chacon@hotmail.com Subject: Re: [tukano-project] floating tones (#26)

That's why I suggest attaching them in brackets to the tones, since in this way they're still there and can be easily restored.

— Reply to this email directly or view it on GitHub https://github.com/digling/tukano-project/issues/26#issuecomment-192309836 .

gomezimb commented 8 years ago

Thiago, in:

weL koL (H) /L(H)/ TUK 'parrot'

the H is not floating; rather, both L and H are linked to o, which is realized with a rising tone, described and transcribed by Ramirez (1997:69) as 'tom de contorno' (contour tone) and realized as MH:

weL koLH /LH/ TUK

thiagochacon commented 8 years ago

Thanks for noticing that, Elsa.

As you can see, I try to explain that in the last lines of that message.

Treating the “contour” tone as a combination of an L tone plus an unassociated/floating H tone is actually my analysis, one that I am writing up for publication in Liames.

Basically for me, Tukano has /H/ and /L/ underlying tones. /H/ is what Ramirez called “Register tone”. The /L/ tone occurs on free and dependent roots as well as bound morphemes.

During tonal derivation, when a word formed by a root and bound morphemes would surface with only L tones, a constraint that all words must have an H tone (OBL_H) inserts the floating (H) at the right edge of the second foot of a prosodic word. If there is just one foot, the (H) is inserted at the right edge of that foot. Feet are bimoraic, and every morpheme boundary ends the foot to its left.

Here is an example of derivations of words with only /L/ as an underlying tone:

  1. UR

CVV /L/ CVCV /L/ CVCV-CV /L/ CVCV-CV-CV /L/ CVCV+CVCV /L/

  2. link /L/ tones

CV̀V̀ CV̀CV̀ CV̀CV̀-CV̀ CV̀CV̀-CV̀-CV̀ CV̀CV̀+CV̀CV̀

  3. OBL_H inserts an H tone at the right edge of the most available mora

CV̀V̌ CV̀CV̌ CV̀CV̀-CV̌ CV̀CV̀-CV̌-CV̀ CV̀CV̀+CV̀CV̌
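For anyone who wants to check the derivation mechanically, here is a minimal Python sketch of it. It only covers words whose sole underlying tone is /L/, and it rests on assumptions that go beyond what is stated above: each morpheme is given simply as a count of moras, feet are built left to right with at most two moras each and never cross a morpheme boundary, and the mora where (H) docks is written `"LH"`. The function names `feet` and `derive` are made up for illustration.

```python
def feet(morphemes):
    """Group the moras of each morpheme into feet of at most two moras.

    `morphemes` is a list of mora counts, one per morpheme, e.g. [2, 1]
    for CVCV-CV. Feet never cross a morpheme boundary (assumption based
    on "every morpheme boundary ends a foot").
    Returns (start, end) mora index pairs, with `end` exclusive.
    """
    result = []
    start = 0
    for count in morphemes:
        end = start + count
        while start < end:
            stop = min(start + 2, end)
            result.append((start, stop))
            start = stop
    return result


def derive(morphemes):
    """Link /L/ to every mora, then let OBL_H dock (H) on one mora.

    Assumes the word has at least one mora.
    """
    tones = ["L"] * sum(morphemes)           # step 2: link /L/ tones
    ft = feet(morphemes)
    target = ft[1] if len(ft) > 1 else ft[0]  # second foot, else the only foot
    tones[target[1] - 1] = "LH"               # step 3: (H) at the foot's right edge
    return tones
```

With this sketch, `derive([2, 1, 1])` (for CVCV-CV-CV) gives `['L', 'L', 'LH', 'L']`, which matches CV̀CV̀-CV̌-CV̀ in the third step.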

I think this analysis is better than Ramirez's or your /LH/ analysis because of several facts:

  1. (H) is not a regular tone, since it has a very specific distribution.
  2. An LH analysis is problematic because:
     i. there is tonal crowding (L+H) on the last mora of words which have the /L/ tone, but that crowding seems unmotivated, since there are more moras than underlying tones;
     ii. the (H) tone never spreads, contrary to /H/ and /L/. Instead of spreading, it is actually linked to only a single mora in a specific prosodic position.
  3. As I have observed, dependent roots (which are toneless or, according to Ramirez, have /L/ tone) surface with a contour tone when they are derived into independent stems.
  4. This analysis also explains “transparent” suffixes, i.e. those on which a (H) can never dock. Basically, a transparent suffix does not project a foot of its own, but always forms one together with the following suffix.
  5. a floating (H) grammatical tone (Ramirez 1997:291)

So, basically, under my analysis, Tukano has a floating (H). I have more to say, including the derivation of /H/ tones which also fits the prosodic/metrical considerations above (for instance, under the prosodic/metrical analysis, we do not need to stipulate a (L) floating tone following /H/ tones because it would be predictable that the domain of /H/ spreading is initial foot/root).

Anyway, it will be nice to share with this group the manuscript as soon as I can finish some acoustic analysis.

levmichael commented 8 years ago

Greetings from Lima. Sorry to have been AWOL for a couple of days -- I was packing up to return to Peru and in transit. Some 'procedural' responses to upthread comments:

  1. I agree with @thiagochacon and @amaliaskilton that for LingPy purposes we simply strip the tones for automated alignment and narrow the discussion of tonal representations to specifically how they will be represented in RefLex. How does that seem to everyone?
  2. To the degree that we want to discuss the analysis of particular tonal systems, let's open a separate issue for that, and reserve this issue for general representation discussions. (Thanks, btw, @gomezimb for your document discussing TAT and BAS tone. Not that we ever doubted it, but these data make it clear that very interesting things will emerge from tonal correspondence sets.)

gomezimb commented 8 years ago

Thanks for your reply Thiago. I'm lost at this point. Weren't we talking about the $$ representation? At that level, in isolation, it seemed to me that we had $weL koLH$.

levmichael commented 8 years ago

It would be good to move this tonal representation discussion to closure, with the hope of reaching a group decision before I drop out of email contact (on Wednesday evening). Since it appears that we will be stripping tone for purposes of LingPy alignment, it seems to me that we can select our RefLex representation with primarily Tukanoanist conventions in mind. Tukanoanists seem generally fond of orienting tones with respect to syllable edges, so for purposes of the quasi-phonemic representation, I reiterate a temporary consensus that was achieved upthread:

  1. Tones are represented as superscripted H or L
  2. Non-floating surface tones associated with a TBU immediately follow the TBU
  3. Floating tones are distinguished from non-floating tones by appearing in parentheses
  4. Floating tones align with the syllable edges in the following manner: 1) leftwards-associating tones align with the left edges of syllables; 2) rightward-associating tones associate with the right edge of syllables, following non-floating tones.

The following form exemplifies all these principles, I believe:

Please comment on this, either agreeing, or if dis-agreeing, proposing specific fixes.
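To make the convention easier to check, here is a small Python sketch of a tokenizer that separates segments, linked tones, and floating tones in a form written this way. It is only an illustration: the thread does not fix which Unicode characters count as "superscripted", so the code points below (modifier letters U+1D34 and U+1D38, superscript parentheses U+207D and U+207E) are an assumption, and the name `tokenize` is made up. The test form is loosely based on the 'parrot' example discussed upthread.

```python
import re

# Assumed code points for the superscript symbols (not decided in the thread).
SUP = {"\u1d34": "H", "\u1d38": "L"}   # superscript H, superscript L
LPAR, RPAR = "\u207d", "\u207e"        # superscript ( and )
TONES = "".join(SUP)

TOKEN = re.compile(
    f"{LPAR}([{TONES}]){RPAR}"   # floating tone: superscript tone in parentheses
    f"|([{TONES}])"              # linked tone: bare superscript tone after its TBU
    r"|(\S)"                     # anything else that is not whitespace: a segment
)


def tokenize(form):
    """Split a form into ("segment" | "linked" | "floating", symbol) pairs."""
    tokens = []
    for floating, linked, segment in TOKEN.findall(form):
        if floating:
            tokens.append(("floating", SUP[floating]))
        elif linked:
            tokens.append(("linked", SUP[linked]))
        else:
            tokens.append(("segment", segment))
    return tokens
```

Because a floating tone is recognized purely by its parentheses, the same function handles a leftward-associating tone written at the left edge of a syllable and a rightward-associating one written after the non-floating tone.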

gomezimb commented 8 years ago

It's OK with me too.

nataliacp commented 8 years ago

I agree with this. I am pretty sure that this works fine in Reflex. @sflavier can you confirm this? I will be back at work on Tuesday and will verify.

thiagochacon commented 8 years ago

Agreed!

nataliacp commented 8 years ago

Just to let you all know that we discussed this with Seb today and everything seems fine. We can discuss with Mattis on which side the "removal" of tonal information will be implemented, but the result will be the same: the alignment will be done on segments only and then columns for the tones will be added. The representations will be iconic, in the sense that floating tones hanging before will be before the TBU. One minor issue is letters or numbers. Seb and I both think that numbers are more generally applicable (since you can have as many levels as you need, while with letters we are somewhat restricted to L, H, and M). That being said, there are plenty of language families where 3 levels are enough, so we understand the power of "custom".

To summarize, we would like to be sure that all symbols that will be used for tone are H, L, M and parentheses (all of them superscript) and then we can move on with adjusting the representations.
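As a rough illustration of "align on segments only, then add the tones back", here is a minimal Python sketch that splits a form into a tone-free string (what the alignment would see) and the tone marks that were removed (what would go into a separate column). The superscript code points and the function names are assumptions for the sake of the example; nothing here calls LingPy or RefLex.

```python
import re

# Assumed code points: superscript H, L, M and superscript parentheses.
TONE_MARKS = re.compile("[\u1d34\u1d38\u1d39\u207d\u207e]")


def strip_tones(form):
    """Return the form with all tone symbols removed (input for alignment)."""
    return TONE_MARKS.sub("", form)


def tone_column(form):
    """Return the removed tone symbols, in order, so they can be re-attached."""
    return "".join(TONE_MARKS.findall(form))
```

Keeping the removed symbols in their original order is what makes the step reversible in principle; how the tone column is lined up with the aligned segments is left open here, as it is in the discussion.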

levmichael commented 8 years ago

to summarize, we would like to be sure that all symbols that will be used for tone are H, L, M and parentheses (all of them superscript) and then we can move on with adjusting the representations.

Yes, this is correct, as I understand. One very important question, however, is something that @amaliaskilton raised upthread: How do we input superscript Unicode H, L, (, and )? The Unicode IPA keyboards that linguists typically use do not have these symbols.
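One possible workaround, sketched here in Python under stated assumptions, is to type the tones as plain capital H, L, M and ordinary parentheses and convert them afterwards. This assumes that capital H, L, M and parentheses never occur in the transcriptions with any other meaning, and that the target characters are the Unicode modifier letters U+1D34, U+1D38, U+1D39 and the superscript parentheses U+207D and U+207E; the name `to_superscript` is made up for illustration.

```python
# Map plain ASCII tone symbols to their (assumed) superscript counterparts.
TO_SUPERSCRIPT = str.maketrans({
    "H": "\u1d34",  # MODIFIER LETTER CAPITAL H
    "L": "\u1d38",  # MODIFIER LETTER CAPITAL L
    "M": "\u1d39",  # MODIFIER LETTER CAPITAL M
    "(": "\u207d",  # SUPERSCRIPT LEFT PARENTHESIS
    ")": "\u207e",  # SUPERSCRIPT RIGHT PARENTHESIS
})


def to_superscript(form):
    """Convert ASCII tone notation such as 'weL koL(H)' to superscripts."""
    return form.translate(TO_SUPERSCRIPT)
```

The conversion could be run once over a whole word list before import, so nobody would need a keyboard layout that has these characters.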