cldf-clts / clts

Cross-Linguistic Transcription Systems
https://clts.clld.org

Tone on syllabic consonants #107

Closed (cormacanderson closed this issue 3 years ago)

cormacanderson commented 3 years ago

I have data with tone marked on (syllabic) sonorants. The data I'm looking at is from Serbo-Croat, but apparently this is possible elsewhere (from Wikipedia): "Tone is most frequently manifested on vowels, but in most tonal languages where voiced syllabic consonants occur they will bear tone as well. This is especially common with syllabic nasals, for example in many Bantu and Kru languages, but also occurs in Serbo-Croatian." Once we put a syllabic marker on a sonorant, that puts the sonorant in the nucleus, so it's little wonder it can act like a vowel.

I don't really see any way of dealing with these cases under the current system. Tone is, naturally enough, dealt with as a vocalic feature in https://github.com/cldf-clts/clts/blob/master/pkg/transcriptionsystems/bipa/diacritics.tsv. I'm not suggesting adding it also to consonants, but the reality is that the syllabic marker means that the sonorant in question isn't really a conventional consonant any more.

Have you any suggestions, @LinguList?

LinguList commented 3 years ago

This is giving me headaches. What I do now in my concrete data is that I represent vocalic nasals as follows, depending on their (known) origin and phonotactics:

n̩ -> n̩/n Ø/ə

This means: I preserve information, but make a fake vowel in there.
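
The slash notation above can be unpacked mechanically. What follows is only a minimal Python sketch. It assumes that the part before each slash preserves the original information, that the part after it is the value used for analysis, and that Ø marks an empty source slot. The function name `split_slash` is hypothetical and not part of CLTS or any other library.

```python
def split_slash(tokens):
    """Split space-separated tokens of the form 'source/target' into two tiers.

    Tokens without a slash are copied unchanged to both tiers.
    """
    source, target = [], []
    for token in tokens.split():
        if "/" in token:
            left, right = token.split("/", 1)
        else:
            left = right = token
        source.append(left)
        target.append(right)
    return source, target


# "n̩/n Ø/ə" written with escapes so the combining mark is unambiguous:
# n + U+0329 (syllabic), U+00D8 (Ø), U+0259 (ə)
source, target = split_slash("n\u0329/n \u00d8/\u0259")

# Dropping the empty slots on the source side gives back the original segment.
original = [s for s in source if s != "\u00d8"]
```

Under these assumptions the target tier holds the nasal plus the "fake vowel", while the source tier still carries the syllabic nasal as transcribed.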

LinguList commented 3 years ago

The problem is that you typically align those in a complex manner, so it is useful to represent them not as normal vowels either, so the following looks awkward:

- n̩ -
n a
- ã
- a n

LinguList commented 3 years ago

So for concrete practice: make a different representation, using the slash construct. For the future: separate tone and segment information, as a rule. This is what I will do in the future in most cases: tone is represented in an additional tier.
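
To illustrate the idea of a separate tone tier, here is a rough Python sketch using only the standard library. The set of tone diacritics, the tone labels, and the names `split_tone` and `tiers` are assumptions made for this example; they do not reflect how BIPA or any existing tool handles tone.

```python
import unicodedata

# IPA tone diacritics (combining marks) mapped to illustrative labels.
# This selection is an assumption for the example, not the BIPA inventory.
TONE_MARKS = {
    "\u0301": "high",     # acute
    "\u0304": "mid",      # macron
    "\u0300": "low",      # grave
    "\u0302": "falling",  # circumflex
    "\u030c": "rising",   # caron
}


def split_tone(segment):
    """Return (segment without tone marks, tone label or None)."""
    decomposed = unicodedata.normalize("NFD", segment)
    tones = [TONE_MARKS[ch] for ch in decomposed if ch in TONE_MARKS]
    base = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", base), (tones[0] if tones else None)


def tiers(segments):
    """Split a list of segments into a segment tier and a parallel tone tier."""
    pairs = [split_tone(s) for s in segments]
    return [base for base, _ in pairs], [tone for _, tone in pairs]


# A syllabic nasal with high tone (n + U+0329 + U+0301) followed by a toneless vowel.
segment_tier, tone_tier = tiers(["n\u0329\u0301", "a"])
```

In this sketch the syllabic marker stays on the segment and only the tone mark moves to the second tier, so the segment tier no longer needs tone to be defined for consonants.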

cormacanderson commented 3 years ago

Okay, fair enough. The primary use case for me in the current task is to check whether the transcriptions are IPA compliant, which these sonorants with tone actually are. I suppose I can just use the orthography profile to change the representation for the Segments field.
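
As a rough illustration of that workaround, the sketch below applies a toy orthography profile with greedy longest-match lookup. It is a stand-in written in plain Python, not the API of an existing profile tool. The profile contents, the input string, the `apply_profile` helper, and the use of U+FFFD for unmatched characters are all assumptions made for the example.

```python
# Toy profile: grapheme in the source form -> value for the Segments field.
# The tonal syllabic r (r + U+0329 + U+0301) is mapped to plain syllabic r.
PROFILE = {
    "t": "t",
    "g": "g",
    "r\u0329\u0301": "r\u0329",
}


def apply_profile(form, profile):
    """Segment `form` by greedy longest match against the profile keys."""
    graphemes = sorted(profile, key=len, reverse=True)
    out, i = [], 0
    while i < len(form):
        for grapheme in graphemes:
            if form.startswith(grapheme, i):
                out.append(profile[grapheme])
                i += len(grapheme)
                break
        else:
            # No profile entry matches here: flag the character and move on.
            out.append("\ufffd")
            i += 1
    return " ".join(out)


# Made-up input string, used only to exercise the profile.
form = "tr\u0329\u0301g"
row = {"Form": form, "Segments": apply_profile(form, PROFILE)}
```

The `row` dictionary keeps the unmodified form alongside the converted segments, so the tone information is not lost even though it is absent from the Segments value.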

xrotwang commented 3 years ago

@cormacanderson Yes, I think that might be a good solution. You can still preserve the full data in a separate column. For me, the litmus test for what should go into any of the CLDF-specified fields (such as segments) is "are there analysis tools making use of this". So we are not attempting to describe everything that could be encountered, but narrow it down to "everything we can make sense of in an automated fashion".

cormacanderson commented 3 years ago

@xrotwang yes, that makes sense from a practical perspective. However, I would prefer that we narrow, as far as possible, the gap between what is encountered (not "could be" here, as this is not a hypothetical but occurs in the data and is valid) and what can be dealt with. There is a danger that things that cannot "be made sense of in an automated fashion" are simply ignored because we can't deal with them, even though, being rare, they are often interesting and also theoretically important.

In this case, I'm not too worried about that, as there is a practical workaround. Where there isn't though, and sometimes there just isn't, my view is that BIPA should do its best to capture everything that is encountered, to stop as much as possible falling into that gap.