digling / tukano-project

Repository for the Tukano project (discussions and automatic data analyses)
GNU General Public License v3.0

floating tones #26

Open nataliacp opened 8 years ago

nataliacp commented 8 years ago

In order to find the best way to represent floating tones, I think it is important that we all understand the phenomenon. So, I did some basic reading on sandhi phenomena and on floating tones and here is what I understood. Disclaimer: I may not be using the most appropriate terminology sometimes, I am still learning!

Sandhi: From what I understand, these phenomena are about tone interaction, and they don't require an extra entity to be invoked. So a given tone is always (or almost always) pronounced as a new, different tone when it comes before or after some other given tone. This reminds me of phonological rules, but involving tones, not segments.

Floating tones: Here we have an entity (a tone without a segment) that is associated with a particular morpheme and it interacts with other neighboring morphemes, by altering their surface tone. In many cases, it was explicit that historically there was a segment associated with this floating tone which was subsequently lost. In this sense, it reminds me of nasal coda vowels, which are left over after the nasal coda consonant disappears.

So, yes, the two phenomena are about tone interaction, but there is a key difference: the presence or not of a historical entity. Crucially a floating tone is associated arbitrarily with a particular root, it is not a rule of tonal interaction that applies throughout the language.

If this previous description is accurate enough, I think we need to track floating tones as entities in our alignment. In some languages we may find that they correspond to no floating tones, or to a segment. So, I think that we need a notation that

  1. gives them their own column in the alignment
  2. makes it clear which root they are associated with and where they are manifested (before or after)
  3. differentiates them from "regular" tones

LinguList commented 8 years ago

Well, I was pointing to exactly this phenomenon in the context of Chinese, where you have the same surface tone that shows two different Sandhi forms, depending on its origin (usually tonal categories of Middle Chinese, which themselves go back to initials and phonation differences in the ancestor).

So you may have a syllable which is pronounced with a [55] (high flat) tone, but which reflects two different original tones in Middle Chinese. As a result, its Sandhi with other syllables differs:

For details, see an early draft on this:

So in order to reflect this, one needs the citation tone and the sandhi tone.

Here, we also have arbitrary assignment of underlying tones, but the change applies to the syllable carrying the underlying tone, not the syllable that follows or precedes it. But I'm sure one can find examples in SEA where underlying tones also influence tonal quality of what precedes or what follows.

I'd like to ask the experts to give me a couple of clear-cut examples preferably with

  1. tone of syllable in isolation
  2. tone of syllable in context

and all in some IPA transcription, so that I can understand where the differences are here, since I still don't see them.

nataliacp commented 8 years ago

Thanks for this explanation Mattis. I still think that the example you mention is more like a phonological rule. It is as if two phonemes (here tones) have merged in some contexts but not all, thus revealing that historically they are different entities. However, the tone is still associated with a syllable, it is not a separate entity.

LinguList commented 8 years ago

But what does it mean that a tone is a separate entity? It is a lexical entity, since it is not encoded in the syllable (you can have two syllables that surface IDENTICALLY, but have different tone sandhi, so you have syllable ma⁵⁵, one time meaning "horse", one time "hemp", but it will have Sandhi ma²¹⁴ one time for "horse", and ma³⁵ the other time for "hemp", so not phonological but lexical). Does the floating tone have a meaning on its own then? And it slips from word to word? So, one word that means the same one time has a floating tone, and one time not? Please, Tukanoanists, explain this, and please also give examples.

amaliaskilton commented 8 years ago

Here is what floating tones are like in Tukanoan. This is a toy example because the languages I am most familiar with do not have floating tones and I don't want to make any inaccurate claims about other lgs.

Let's imagine we have a language which contrasts 3 surface melodies on verb roots:

  • HH bábá-
  • LL bàbà-
  • HL bábà-

Now let's imagine that we have a suffix /ta/ which can attach to any of these verb roots. When we attach /ta/ to the roots, we find that /ta/ is L following the HH root and the HL root:

  • HH-ta = HHL bábátà
  • HL-ta = HLL bábàtà

We decide that /ta/ has an underlying L tone. But then when we attach /ta/ to LL bàbà, something unexpected happens: /ta/ shows up as H.

LL-ta = bàbàtá

We look at other LL verb roots and find out that they are actually in two classes. Some LL verb roots make /ta/ surface H, like bàbà. Other LL verb roots make /ta/ surface as L, like it does after HL and HH:

  • LL-ta #1 = LLH bàbàtá
  • LL-ta #2 = LLL pàpàtà
  • LL-ta #3 = LLH kàkàtá
  • LL-ta #4 = LLL gàgàtà

From facts like this, we reach the conclusion that not all LL verb roots are the same. Some give following /ta/ (or maybe all morphs) an H tone, some allow /ta/ to surface with its apparently underlying tone, an L. So we annotate LL verb roots as belonging to two different phonological classes: LL with no floating tone (allows /ta/ to take its UR tone) and LL with a floating H tone at the right (makes /ta/ surface as H). Thus our fake language now has 4 phonological classes for verb roots:

  • HH bábá
  • HL bábà
  • LL pàpà
  • LL-H bàbà

We could even have two verb roots, bàbà- and bàbàH-, that are identical "in isolation" but differ in their effect on a following morph such as /ta/: with the affix, bàbà- LL becomes bàbàtà LLL, but bàbàH- LL-H- becomes bàbàtá LLH. Minimal pairs like this (homophonous tones on the morph itself, different tones imposed on the following morph) are common in African tone languages and also exist in some ET languages.

This is different from tone sandhi in that the diacritic feature tells you about what tone the next morph will take, rather than about the tone the focus morph itself will have in different tonological contexts.
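The toy pattern above can be written down as a small sketch. Everything in it is hypothetical: the `Root` class, the `attach_ta` function, and the assumption that a floating tone simply replaces the underlying L of /ta/ belong to the invented language only, not to any real Tukanoan system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Root:
    """A verb root of the invented toy language (hypothetical model)."""
    segments: str       # e.g. "baba"
    melody: str         # surface melody on the root itself, e.g. "LL"
    floating: str = ""  # floating tone at the right edge, "" if none


def attach_ta(root: Root, suffix_tone: str = "L") -> str:
    """Return the surface melody of root + /ta/.

    Assumption: a floating tone on the root overrides the underlying
    tone of the suffix; otherwise the suffix keeps its own tone.
    """
    return root.melody + (root.floating or suffix_tone)


roots = [
    Root("baba", "HH"),
    Root("baba", "HL"),
    Root("papa", "LL"),                # LL without a floating tone
    Root("baba", "LL", floating="H"),  # LL with a floating H
]

for root in roots:
    print(root.segments, root.melody, root.floating or "-", attach_ta(root))
```

The two LL roots are indistinguishable by `melody` alone and only differ in what `attach_ta` returns, which is the minimal-pair situation described above.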

LinguList commented 8 years ago

Thanks for the explanation!

But the phenomenon is functionally the same as what we can see in tone sandhi: you have two actually distinct lexemes which surface identically (as homophonous words, or, say, two classes of lexemes), but in certain combinations they will surface differently, or cause their neighbors to surface differently, thus triggering a distinction they don't have in isolation, and revealing a former distinction that they also had in isolation. The only difference is that in your example, the distinction surfaces on the following element, while it would surface on the same element in my example. But there's no logical reason in tone sandhi, as I said, that this could not also happen (only that I don't have an example for this, but I can ask my colleagues here, who probably know). The assumption here is that, contrary to normal processes in which distinctions are lost in combination, combination preserves distinctions, yet where they surface just depends on the individual developments.

In my example, the word san⁵⁵ would have two meanings that are not distinguished in pronunciation in isolation but are distinguished in combination, revealing that san⁵⁵ goes back to two formerly different pronunciations (say, san³⁵ and san²¹⁴). But Sandhi can influence to the left, to the right, and even to both sides, depending on some constraints and rules (they love this in OT). So there's no logical reason why we would not have cases where, say, san⁵⁵ is always san⁵⁵, but will trigger different sandhi in following or preceding words, depending on its origin.

And handling this will actually mean that you need to find a way to mark suprasegmentally which elements are different although they surface identically. If you are then able to further distinguish the direction in which the influence goes, and if this is always regularly some low tone, or something predictable, just assign the syllable in question a tone followed or preceded by an asterisk (provided you know the direction in which the tone goes):

  • b a ¹ for the normal syllable
  • b a ¹* for the syllable triggering change to the right
  • b a *¹ for the syllable triggering change to the left.

And in combination, you should ideally indicate the tone change by using some sandhi annotation (although there are different ones), e.g.:

  • b a ¹* l a ¹⁽⁵⁾

or

  • b a ¹* l a ¹·⁵

or

  • b a ¹* l a ¹₅

or a little superscript dash between source and target for the second syllable.

I see why it makes sense to understand those words which show distinction only in combination, but it's nothing one could not address in a way that is still pleasing to the eye.

amaliaskilton commented 8 years ago

I think the first of your three bullet-pointed options would be the easiest to get used to, since the convention most of us use is bàbàH- or bàbà(H) rather than something with an asterisk.

The primary difference is indeed that the alternation occurs on the following morphs and not on the root. Another important difference from Chinese-style tone sandhi is that (from my understanding) all morphs in Chinese have their own tones, while in Tuk the tones of many morphs are determined exclusively from the preceding tones in the prosodic word.

LinguList commented 8 years ago

while in Tuk the tones of many morphs are determined exclusively from the preceding tones in the prosodic word.

That is indeed a difference.

BTW: I think if there's just H and L, I would opt for this annotation with superscripts, but it is important to stick to capitals, since normal superscripts are reserved for aspiration (see Wikipedia on superscripts in general).

So you could write things as

and maybe it's better not to use an asterisk, which is again used elsewhere, so using brackets is probably better for the annotation of the floating tone stuff:

and maybe use a superscript x to indicate direction of influence (if this matters, so far it seems only to work to the right?)

or the little dot (there's no underscore in superscript, which would probably be nicest)

Do you want to annotate the sandhi in the following syllables? Then more conventions are needed.

levmichael commented 8 years ago

b a ᴴ b a ᴴ⁽ᶫ⁾

This is quite close in spirit to the way Tukanoanists already 'traditionally' represented floating tones. If there are any leftwards spreading floating tones (and here an ET specialist will need to speak up), I'd be inclined toward the following, more iconically transparent, representation:

  • ⁽ᶫ⁾b a ᴴ b a ᴴ

This seems to me to be simpler than additional superscripts indicating direction of spread.

amaliaskilton commented 8 years ago

There are left edge floating tones in Barasana and Tatuyo.

I don't have very strong opinions about the superscript representations, but a question: are capital superscripts available in IPA fonts like Charis? I can't find them on my IPA keyboard.

LinguList commented 8 years ago

@levmichael, I prefer to avoid any representation which separates tones over different positions in the word, since this is, first, not visually pleasing in alignments and, second, difficult to handle in algorithms.

@amaliaskilton, Unicode and IPA keyboard are two different things. Unicode offers many more characters than IPA.

levmichael commented 8 years ago

@LinguList, I think we are quite clear on the differences between Unicode and IPA; I think @amaliaskilton's question was a practical one about how to generate the Unicode symbols for superscript H and L, since they don't appear on the IPA keyboards that linguists tend to use.

levmichael commented 8 years ago

@LinguList, OK, I see what you're thinking. What about:

LinguList commented 8 years ago

opposed to which alternative annotation?

levmichael commented 8 years ago

As opposed to ones that require an additional symbol to indicate the direction of association of floating tones. Here the convention would be that floating tones associated with the leftmost syllable associate leftwards, and floating tones associated with the rightmost syllable associate rightwards.

LinguList commented 8 years ago

if there are no three-syllabic words, and no word has two floating tones, and you indicate morpheme boundaries in case stuff is attached, this is feasible, although you should keep in mind that in cases where you have cognate floating tones which spread in different directions, you won't be able to see that in the alignment...

levmichael commented 8 years ago

I'm not sure all the restrictions you stipulate above are necessary under the representation conventions I suggested. So here is an example of a three syllable word with a leftward-associating LH tone and a rightward associating HL tone.

*b a ᴴ⁽ᶫᴴ⁾ k aᶫ b a ᴴ ⁽ᴴᶫ⁾

This doesn't seem problematic to me, but what do you think, @LinguList?

LinguList commented 8 years ago

I was more thinking of situations where you have a word, like, say:

from language x, and a word like

from language y, and these words are cognate, and you align them, and then you do not align the information regarding floating tone, since you have spread it over two syllables.

b a ᴴ      k a ᶫ⁽ᶫᴴ⁾
b a ᴴ⁽ᶫᴴ⁾  k a ᶫ

If that is unlikely to happen, I think it's fine.
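The concern can be made concrete with a short sketch. It assumes space-tokenised words and floating tones written in superscript brackets as in the two rows above; the function name `floating_columns` is made up for illustration.

```python
import re

# Floating tone written in superscript brackets, e.g. ⁽ᶫᴴ⁾ (assumed notation).
FLOATING = re.compile("⁽([ᴴᶫ]+)⁾")


def floating_columns(word: str) -> dict[int, str]:
    """Map column index -> floating tone for a space-tokenised word."""
    found = {}
    for index, token in enumerate(word.split()):
        match = FLOATING.search(token)
        if match:
            found[index] = match.group(1)
    return found


lang_x = "b a ᴴ k a ᶫ⁽ᶫᴴ⁾"
lang_y = "b a ᴴ⁽ᶫᴴ⁾ k a ᶫ"

fx = floating_columns(lang_x)
fy = floating_columns(lang_y)

# Columns in which only one of the two cognates carries a floating tone.
mismatched = sorted(set(fx) ^ set(fy))
print(fx, fy, mismatched)
```

Both words carry the same floating tone, but because it is attached to different syllables it ends up in different columns, so a column-by-column comparison never sees the two as corresponding.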

levmichael commented 8 years ago

I see your point. I think that the concern that you express is the tip of the iceberg, in the sense that I can also easily see wanting to align floating tones with tones that are associated with particular TBUs, which will be similarly problematic. At this stage, it is not clear to me which of the types of conventions that we are considering would actually be superior, so my personal inclination is to adopt the convention I suggested above, which is closer to the way Tukanoanists often represent floating tones -- i.e. as tone letters preceding or following the morpheme. I'm curious what my fellow Tukanoanists, and especially Elsa, think of this issue.

amaliaskilton commented 8 years ago

My impression is that floating tones in Eastern Tukanoan languages are quite often due to diachronic loss of a vowel with preservation of its tone. I believe apheresis is the source of floating tones in Kotiria (but I do not have Kris' book at hand to check) and in our skype conversation Elsa mentioned segment+tone prefixes in Tatuyo which are only tones in Barasana. So it is extremely likely we will need to align floating tones with tones anchored to segments.

LinguList commented 8 years ago

In the end, all decisions go also back to questions of frequency. For example, how often will FT point to the left, how often will it point to the right, and how often are there cognate words which developed FT in two daughter languages pointing in different directions. If it's one case in the end that does not even occur in your lists, I'd just ignore it, if it's more, I'd reconsider.

It's up to you all, since my major requests from the computational perspective (a fixed place for tone at the end of syllables, to facilitate and prettify alignments, and a regularized representation with superscripts) are met by @levmichael's solution.

thiagochacon commented 8 years ago

I initially like Lev's suggestions for the representation of floating tones, but I think it would be a bit problematic for monosyllables.

Let me summarize things to see if I fully understand it (which may serve for others as well).

Is this correct?

If so, how do we handle monosyllabic/monomoraic forms? If kaH(L) means a floating tone associating to the right, should (L)kaH mean a floating tone associating to the left?

Then the form b a ᴴ⁽ᶫᴴ⁾ k aᶫ b a ᴴ ⁽ᴴᶫ⁾ would be better written as ⁽ᶫᴴ⁾b a ᴴ k aᶫ b a ᴴ ⁽ᴴᶫ⁾

What do you guys think?

levmichael commented 8 years ago

@thiagochacon -- yes, that's exactly what I would prefer! But I was trying not to stray too far from @LinguList's original idea...

LinguList commented 8 years ago

Okay, but what about cases where FT is in a compound with one monosyllable and something before and after? You see, computationally, this means there is no way to guess where the FT belongs, and since I'm looking for solutions that are also feasible for other language families, not only for TK, it is important that all things are thought about, and here, putting a tone in front of a syllable instead of after it runs the danger of becoming ambiguous.

So why not stick with the original idea of using an asterisk or a superscript dot or dash to show where the tone in brackets attaches to as FT? Computationally, we should never separate information which belongs together. So @thiagochacon's example would become:

Note that there are no spaces between original tone annotation and floating-tone annotation, which is now handled as an extra information for a given tone.
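A minimal sketch of what "extra information for a given tone" could look like to a parser, assuming superscript H and L for tones and superscript brackets for the floating tone. The direction marker (asterisk, dot, or dash) is left out because its final form is not settled here; `parse_tone` is a hypothetical helper.

```python
import re
from typing import Optional, Tuple

# A tone token: one or more superscript tone letters, optionally followed
# directly (no space) by a bracketed floating tone. Assumed notation.
TONE = re.compile("^(?P<tone>[ᴴᶫ]+)(?:⁽(?P<floating>[ᴴᶫ]+)⁾)?$")


def parse_tone(token: str) -> Optional[Tuple[str, Optional[str]]]:
    """Split a tone token into (anchored tone, floating tone or None).

    Returns None if the token is not a tone token at all (e.g. a segment).
    """
    match = TONE.match(token)
    if match is None:
        return None
    return match.group("tone"), match.group("floating")


for token in "b a ᴴ b a ᴴ⁽ᶫ⁾".split():
    print(repr(token), parse_tone(token))
```

Because the floating tone never appears as a token of its own, the number of columns per syllable stays fixed and the parser does not have to guess which syllable the bracket belongs to.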

LinguList commented 8 years ago

Just to re-inforce problematicity of spreading the information, think of the opposite case, where the syllables are shuffled:

This is ambiguous and problematic and parsing will be an obstacle.

And if there are monosyllables and bisyllables, and FT can apply to either of them, don't forget about morpheme segmentation! So given @thiagochacon's point with mono-syllables, for FTs we need:

This is in my view the most consistent solution, and the general handling of using brackets for extra-annotation is also computationally easy to handle and expandable to other use-cases.

thiagochacon commented 8 years ago

ok, @LinguList, so you don't want to separate the information and don't want to have syllable edges (left or right) indicating direction of association. I see your point. I think more diacritics would make the data less elegant, but maybe separating the information gives you the same impression too.

I would leave for you and @sflavier to decide how to proceed on this.

levmichael commented 8 years ago

I'm interested in hearing minimally what @amaliaskilton and Elsa think about this issue.

amaliaskilton commented 8 years ago

I think the superscript dash is a fine solution and easy to produce from what we already have. I'm not aware of morphs which have floating tones on both edges in any Tukanoan language (in fact I can hardly think of examples like that from the better-studied African and SEA tone languages).

I agree that floating tones make morpheme boundaries more important, and I am fine with any notation for morph boundaries that distinguishes the categories I mentioned on the other thread (prefix-root, root-root, root-affix). But this brings up some new questions for me, namely: How will we handle morphs that have tones with no segmental content? Will they be represented solely as floating tones?

gomezimb commented 8 years ago

Just to make sure that we are referring to the same entity when we talk about Floating Tones FT, let us work with a real ET language. Given the following representations, could you derive the surface tones?

⁽ᴴ⁾ idi ᴴ ⁽ᶫ⁾ re ⁽ᴴᶫ⁾ b a a ᴴ re

nataliacp commented 8 years ago

I agree with Lev's initial proposal to have left edge floating tones superscript and before the word, exactly where we expect them to surface. This is the natural place of any other tone they are cognate to, whether that tone is floating or not. So, if the word in one language is aHbaLbaL and in the other (H)maLmaL, then the surface tone of the one and the floating tone of the other are perfectly alignable. (I mean them superscript, but I can't find how to do it on my keyboard, sorry). As for differentiating tones for the automatic alignment, I think it is easier to define them by their symbols rather than their positions, for example if we use only numbers, or only H and L, or only superscript H and L.

levmichael commented 8 years ago

@gomezimb, are those underlying representations? (In the quasi-phonemic representation, every TBU should have a tone assigned to it, which obviates the need to derive the surface tones.) But to answer your question, I can't tell you what the surface tones are without knowing what the spreading rules are for the tones in the language.

levmichael commented 8 years ago

So, to summarize the current state of the discussion, we have two proposals on the table, which I believe differ on only one point, namely whether the directionality of floating tones is indicated by: 1) the position of the floating tone with respect to its associated syllable (preceding for leftward associating, following for rightward); or 2) the use of a diacritic within the floating tone parentheses. @LinguList's point in favor of the latter, which I certainly appreciate, is that it is less likely to result in parsing ambiguities. I think those of us who lean toward the first option are drawn by its similarity to extant Tukanoanist conventions.

Having thought about it, I don't see either proposal as clearly far superior to the other (each having advantages and disadvantages of its own), so I am inclined to defer to the people who will have to deal with implementing whichever proposal we adopt, either with respect to the data (especially @amaliaskilton and @thiagochacon) or with respect to the tools (@nataliacp, @LinguList, and Seb).

@nataliacp, do you think we should, as @thiagochacon intimated above, bring Seb into this discussion to see if he has any strong feelings?

nataliacp commented 8 years ago

Note that there are no spaces between original tone annotation and floating-tone annotation, which is now handled as an extra information for a given tone.

In re-reading this thread while preparing to discuss it with Seb, the quote above caught my attention. @LinguList, do you mean that the surface tone and the floating tone in, say, H(L) occupy one column in the alignment? I think they absolutely need to be two separate columns, based on everything we have said up to now. And if they are separate columns, why does it matter which side of the syllable the floating tone is on?

thiagochacon commented 8 years ago

Indeed, floating tone and realized tones are different entities and must be in a different cell for alignment. It is not the case that a floating tone is "a tone" that modifies another tone.

LinguList commented 8 years ago

Okay, now before we go any further with this, I want to be presented with an alignment of at least five different words across five Tukano languages with and without the floating tones, and please, put it in different representations (FT anywhere, iconic, whatever).

I want to have a clear assessment regarding the degree of the problem, the ugliness of the solutions, etc.

And if you don't have morpheme segmentation, iconic listing of FTs won't work, I hope this is understood, since you'll never know where to attach the thing in brackets. So if you don't have morphemic annotation, you can't use the iconic style.
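To illustrate why the iconic style depends on morpheme segmentation, here is a rough sketch. It assumes ASCII forms such as `baH` and `(L)`, a hypothetical `+` token for morpheme boundaries, and the simplifying assumption that a floating tone never sits across a boundary from its host; none of this is an agreed convention.

```python
def candidate_hosts(tokens: list[str], i: int) -> list[str]:
    """Return the syllables that the floating tone at tokens[i] could belong to.

    A floating tone written between two syllables could belong to the
    syllable on its left (associating rightward) or to the syllable on
    its right (associating leftward). A "+" boundary rules one side out.
    """
    left = tokens[i - 1] if i > 0 else None
    right = tokens[i + 1] if i + 1 < len(tokens) else None
    hosts = []
    if left is not None and left != "+":
        hosts.append(left)   # floating tone follows its host
    if right is not None and right != "+":
        hosts.append(right)  # floating tone precedes its host
    return hosts


print(candidate_hosts(["baH", "(L)", "kaH"], 1))       # no segmentation
print(candidate_hosts(["baH", "(L)", "+", "kaH"], 1))  # boundary after the FT
print(candidate_hosts(["baH", "+", "(L)", "kaH"], 2))  # boundary before the FT
```

Without a boundary token the function returns two candidates, which is the ambiguity in question; with a boundary on either side it returns exactly one.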

And please note that the FT as some suprasegmental entity is actually something that you can put virtually ANYWHERE, as long as you make clear which syllable or morpheme it applies to, so you can, as I think is best, just attach it to some other tone so it doesn't look too ugly in the alignments.

But please give me some multiple alignments, and let's discuss this on the data ;-)

LinguList commented 8 years ago

BTW, you can nicely align stuff here by writing tables, just like this:

| . | . | . | . | . |
|---|---|---|---|---|
| b | a | b | b | c |
| b | a | c | c | (d) |

Just expand the template (| is a cell-delimiter, the first two lines are important to show it's a table), and it's easier to follow each other when talking about the ugliness and beauty of alignment solutions...
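For anyone who prefers to generate such tables rather than type them, a small sketch follows. The function name `alignment_to_markdown` and the use of `-` to pad shorter rows are assumptions made for the example.

```python
def alignment_to_markdown(rows: list[list[str]]) -> str:
    """Render aligned rows as a pipe-delimited table.

    The first two lines (a header of dots and a rule of dashes) are what
    marks the text as a table; shorter rows are padded with "-" as a gap.
    """
    width = max(len(row) for row in rows)
    padded = [row + ["-"] * (width - len(row)) for row in rows]
    header = "| " + " | ".join(["."] * width) + " |"
    rule = "|" + "|".join(["---"] * width) + "|"
    body = ["| " + " | ".join(row) + " |" for row in padded]
    return "\n".join([header, rule, *body])


print(alignment_to_markdown([
    ["b", "a", "b", "b", "c"],
    ["b", "a", "c", "c", "(d)"],
]))
```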

amaliaskilton commented 8 years ago

Since there are no floating tones in the Western languages either @thiagochacon or @gomezimb will have to give you the real examples. I am not sure there are any cases where, in a single cognate set, five languages all have floating tones.

LinguList commented 8 years ago

That's exactly what I want: five languages, but not necessarily floating tones. I just want to see to which degree the iconicity in cognate words makes it difficult to align the floating tones, and to what extent the independent annotation of floating tones is important for the alignment.

Note that I'm not against this per se, although I have the gut feeling (based on other languages where we have comparable phenomena of words which have distinctions which only surface in context) that these things are not necessarily annotated independently, but I want to see clear examples before we decide on any format...

gomezimb commented 8 years ago

I'm preparing a sample of TAT & BAS cases to send you today.

thiagochacon commented 8 years ago

I have the same plan for KUB and TUK too.

thiagochacon commented 8 years ago

See if that is useful at all

          KUB                  TUK                  gloss
CVCV      weL koH (L) /HL/     weL koL (H) /L(H)/   'parrot'
          muL haH /H/          moL saH (L) /H/      'achiote'
          oL peH (L) /HL/      oL peH (L) /H/       'breast'
CVV       hiL aH (L) /HL/      diH aH (L) /H/       'river'
          kɨL iH (L) /HL/      kiL iL (H) /L(H)/    'manioc'
CV(CV)    puL (H) (L) /HL/     puL tiL (H) /L(H)/   'blow'

gomezimb commented 8 years ago

I attach the file on TAT and BAS tones. I tried to be as clear as possible, but if I failed please do not refrain and ask questions.

TAT_BAS_tones.pdf

LinguList commented 8 years ago

well, how would you align these things?

as an example, consider @thiagochacon's first example ("parrot"):

Language  ...   ...   ...   ...   ...   ...   ...   ...
KUB       w     e     L     k     o     H     (L)   /HL/
TUK       w     e     L     k     o     L     (H)   /L(H)/

It makes more sense to have underlying and surface tone together:

Language  ...   ...   ...   ...   ...   ...    ...
KUB       w     e     L     k     o     H/HL   (L)
TUK       w     e     L     k     o     L/H    (H)

And then I still have problems seeing the placement of floating tones which I would assign to the normal tones, since tones are also a suprasegmental thing. But potentially, just give me some examples, how you imagine the alignments of cases where we have languages with floating tones and without, etc.

So my major question is: how will it look with the iconic representation of floating tone? Is this consistent or ambiguous? How does it look in multiple alignments?

LinguList commented 8 years ago

Alternatively, you may just create a fake alignment in the EDICTOR and show it here as a screenshot...

thiagochacon commented 8 years ago

I wouldn't like to see the kind of alignment you suggested, where tones and vowels each get their own column in a single alignment.

Rather, in one alignment form, I think the surface tone should go along with the vowels, and the floating tone should stand by itself.

In another alignment form, tones should be aligned separately from vowels.

In addition, let's not mix up surface and underlying tones in the alignment! They can be part of the same lexical representation, but not of the forms used for alignments.

Thus:

Alignment 1:
KUB weL koH (L)
TUK weL koL (H)

Alignment 2 (where this alignment is linked to the cognate IDs for "weko", "weko", etc.):
KUB L H (L)
TUK L L (H)
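
The two alignment forms proposed here could be derived from a single form string. The following is only a minimal sketch, assuming that syllables are separated by spaces, that every syllable ends in a surface tone letter (`H` or `L`), and that a floating tone is a separate token in parentheses. The function name `split_tiers` is hypothetical and is not part of Reflex or LingPy.

```python
import re

# Assumed notation: a syllable is its segments followed by one tone letter.
SYLLABLE = re.compile(r"^(?P<segments>.+?)(?P<tone>[HL])$")
# Assumed notation: a floating tone is a tone letter in parentheses.
FLOATING = re.compile(r"^\((?P<tone>[HL])\)$")


def split_tiers(form):
    """Split a form like 'weL koH (L)' into a segment tier and a tone tier.

    Floating tones have no segments, so they appear only on the tone
    tier, with their parentheses kept so they stay recognisable.
    """
    segments, tones = [], []
    for token in form.split():
        if FLOATING.match(token):
            tones.append(token)
            continue
        syllable = SYLLABLE.match(token)
        if syllable is None:
            raise ValueError(f"unexpected token in form: {token!r}")
        segments.append(syllable.group("segments"))
        tones.append(syllable.group("tone"))
    return segments, tones


for language, form in [("KUB", "weL koH (L)"), ("TUK", "weL koL (H)")]:
    segments, tones = split_tiers(form)
    print(language, " ".join(segments), "|", " ".join(tones))
```

The tone tier returned for each language corresponds to a row of Alignment 2, while the unmodified input string corresponds to a row of Alignment 1.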


LinguList commented 8 years ago

well, then I suggest that for the form to be passed to lingpy, and also for ease of discussion, etc., we just omit the floating tones, or even the tones altogether (that would be the FUN field, right?). If more specific alignments are needed, these can be done in additional forms using the tone data, using Reflex's full alignment flexibility. This would also avoid any further discussion of where you annotate the floating tones, since my arguments only apply to automatic procedures, where I need to insist on consistency in representation because otherwise the algorithms won't work.

thiagochacon commented 8 years ago

I have always felt that it would be simpler to align tones and segments in separate fields, but that would make us lose potential cases of tone and segment co-evolution.

My suggestion is the following:

  1. In Reflex, we have the PHT, PHM and FUN representations as we discussed them already (this is done, I think)
  2. When importing to LingPy, we remove tones
  3. When reimporting to Reflex, we add the tones again to the aligned forms

If 3 is possible, then I would be ok with 2.
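
Steps 2 and 3 could be sketched as a round trip like the one below. This is only an illustration under stated assumptions: forms are already tokenized into lists, tones are the tokens `H`, `L`, `(H)` and `(L)`, and the aligned form uses `-` as its gap symbol. The helpers `strip_tones` and `restore_tones` are hypothetical names, and no actual Reflex or LingPy call is shown.

```python
GAP = "-"  # assumed gap symbol in the aligned output


def is_tone(token):
    """True for surface tones (H, L) and floating tones ((H), (L))."""
    return token.strip("()") in {"H", "L"}


def strip_tones(tokens):
    """Remove tone tokens, remembering how many segments precede each one."""
    segments, tones = [], []
    for token in tokens:
        if is_tone(token):
            tones.append((len(segments), token))
        else:
            segments.append(token)
    return segments, tones


def restore_tones(aligned, tones):
    """Re-insert the remembered tones into an aligned, possibly gapped, form."""
    pending = list(tones)
    restored = []
    seen = 0  # number of real (non-gap) segments emitted so far
    while pending and pending[0][0] == 0:
        restored.append(pending.pop(0)[1])
    for token in aligned:
        restored.append(token)
        if token == GAP:
            continue
        seen += 1
        while pending and pending[0][0] == seen:
            restored.append(pending.pop(0)[1])
    return restored


tokens = "w e L k o H (L)".split()
segments, tones = strip_tones(tokens)
print(segments)                        # tone-free form for alignment
print(restore_tones(segments, tones))  # original form recovered
```

Because each tone is stored with the count of segments before it, gaps introduced by the alignment do not shift the tones: they are re-attached directly after the segment they originally followed.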


LinguList commented 8 years ago

you can also leave the tones in the alignments with lingpy, just remove the floating tones, or attach them to the tones, as specified. Since they won't influence the alignment in any way (one would need to re-invent an algorithm working with that information), they need to be put somewhere, where they are not treated as segments, and this is why I want to attach them to the tones. Alternatively, delete them in lingpy alignments, and re-attach them.

Yet basically, tones would not be a problem for the lingpy alignment; only adding them in inconsistent places based on the iconicity-representation idea is not feasible.

thiagochacon commented 8 years ago

deleting floating tones for the purpose of lingpy parsing and analysis is ok with me, if we can restore them when importing back to Reflex


LinguList commented 8 years ago

that's why I suggest attaching them in brackets to the tones, since in this way, they're still there and can be easily restored.
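
One way to read this suggestion is that a floating tone is merged into the preceding tone token before alignment and split off again afterwards. The sketch below assumes tokenized forms in which tones are `H` or `L` and floating tones are `(H)` or `(L)`; the merged spelling `H(L)` and the function names are assumptions made for illustration only.

```python
import re

FLOATING = re.compile(r"^\([HL]\)$")          # e.g. "(L)"
TONE = re.compile(r"^[HL](?:\([HL]\))*$")     # e.g. "H", "H(L)", "L(H)(L)"


def attach_floating(tokens):
    """Merge each floating tone into the tone token directly before it."""
    merged = []
    for token in tokens:
        if FLOATING.match(token) and merged and TONE.match(merged[-1]):
            merged[-1] += token
        else:
            # Segments, plain tones, and floating tones with no preceding
            # tone are kept as separate tokens.
            merged.append(token)
    return merged


def detach_floating(tokens):
    """Undo attach_floating: split 'H(L)' back into 'H' and '(L)'."""
    restored = []
    for token in tokens:
        if TONE.match(token):
            restored.append(token[0])
            restored.extend(re.findall(r"\([HL]\)", token))
        else:
            restored.append(token)
    return restored


tokens = "w e L k o H (L)".split()
merged = attach_floating(tokens)
print(merged)
print(detach_floating(merged) == tokens)
```

With this encoding the floating tone never occupies a column of its own, so it cannot end up in inconsistent positions across languages, and the original tokens can still be recovered.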

thiagochacon commented 8 years ago

I see, and now I understand where you are coming from. However, I guess we need to separate the ideal linguistic representation from the capabilities of Reflex and LingPy to handle these representations. If Reflex can structure the data closer to the way we want it represented, and we then just need to adapt things so that LingPy can do its job better while still keeping the original data structure when going back to Reflex, I think this would be preferred.

Otherwise we will need a more practical solution, which would make it uglier for linguists, but prettier for the computers... : )
