Closed cormacanderson closed 3 years ago
This is giving me headaches. What I do now in my concrete data is that I represent vocalic nasals as follows, depending on their (known) origin and phonotactics:
n̩ -> n̩/n Ø/ə
This means: I preserve information, but make a fake vowel in there.
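A minimal sketch of how such slash tokens could be read, assuming the left side of each `/` holds the preserved original value and the right side holds the value that analysis tools should see. The helper names `split_slash` and `read_tier` are hypothetical and not part of any library:

```python
def split_slash(token):
    """Return (source, target) for a token such as 'n̩/n'.

    Assumption: text before '/' is the preserved original,
    text after '/' is the value used for analysis.
    Tokens without a slash map to themselves on both sides.
    """
    source, sep, target = token.partition("/")
    return (source, target) if sep else (token, token)


def read_tier(tokens, side="target"):
    """Collect either the 'source' or the 'target' side of every token."""
    index = 0 if side == "source" else 1
    return [split_slash(token)[index] for token in tokens]


# The syllabic nasal from above, written as two slash tokens.
tokens = "n̩/n Ø/ə".split()
analysis_view = read_tier(tokens)            # what alignment tools would see
original_view = read_tier(tokens, "source")  # what the data actually contains
```

Both views stay in the same string, so nothing is lost when the fake vowel is introduced.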
The problem is that you typically align those in a complex manner, so it is useful to represent them not as normal vowels either, so the following looks awkward:
- n̩ -
n a
- ã
- a n
So for concrete practice: make a different representation, using the slash construct. For the future: separate tone and segment information, as a rule. This is what I will do in the future in most cases: tone is represented in an additional tier.
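Separating tone into its own tier could look roughly like the sketch below. It assumes tone is written with the standard IPA combining diacritics listed in `TONE_MARKS`; that table and the function names `split_tone` and `to_tiers` are illustrative assumptions, not an existing API:

```python
import unicodedata

# Assumption: tone is marked with these combining diacritics only.
TONE_MARKS = {
    "\u0301": "high",     # combining acute accent
    "\u0300": "low",      # combining grave accent
    "\u0304": "mid",      # combining macron
    "\u0302": "falling",  # combining circumflex
    "\u030c": "rising",   # combining caron
}


def split_tone(segment):
    """Return (segment without tone marks, list of tone labels)."""
    # Decompose first so precomposed letters expose their tone marks.
    decomposed = unicodedata.normalize("NFD", segment)
    base = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    tones = [TONE_MARKS[ch] for ch in decomposed if ch in TONE_MARKS]
    return unicodedata.normalize("NFC", base), tones


def to_tiers(segments):
    """Build parallel segment and tone tiers; '-' marks 'no tone here'."""
    segment_tier, tone_tier = [], []
    for segment in segments:
        base, tones = split_tone(segment)
        segment_tier.append(base)
        tone_tier.append("+".join(tones) if tones else "-")
    return segment_tier, tone_tier
```

With this split, a syllabic sonorant keeps its syllabicity marker in the segment tier while its tone moves to the tone tier, so the segment tier no longer depends on tone being a vocalic feature.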
Okay, fair enough. My primary use case in the current task is to check whether the transcriptions are IPA-compliant, which these sonorants with tone actually are. I suppose I can just use the orthography profile to change the representation for the Segments field.
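As a rough illustration of that workaround, here is a plain-Python sketch of a profile-style mapping applied by greedy longest match. It does not use the API of any orthography-profile package; `PROFILE`, `tokenize`, and the input string are made up for the example:

```python
import unicodedata

# Hypothetical profile: grapheme in the source -> value for the Segments field.
# Tone-bearing syllabic sonorants are mapped to their toneless counterparts.
PROFILE = {
    "r\u0329\u0301": "r\u0329",  # syllabic r + acute -> syllabic r
    "r\u0329\u0300": "r\u0329",  # syllabic r + grave -> syllabic r
    "r\u0329": "r\u0329",
    "t": "t",
    "n": "n",
    "a": "a",
}


def tokenize(word, profile, unknown="\ufffd"):
    """Segment `word` by greedy longest match against the profile keys."""
    # Normalize both sides so combining marks come in a predictable order.
    table = {unicodedata.normalize("NFD", k): v for k, v in profile.items()}
    keys = sorted(table, key=len, reverse=True)
    word = unicodedata.normalize("NFD", word)
    output, i = [], 0
    while i < len(word):
        for key in keys:
            if word.startswith(key, i):
                output.append(table[key])
                i += len(key)
                break
        else:
            # No profile entry matched: flag the character instead of dropping it.
            output.append(unknown)
            i += 1
    return output


# Made-up input string, not real data.
segments = tokenize("tr\u0329\u0301n", PROFILE)
```

The original transcription can be kept in a separate column, so only the Segments field carries the simplified form.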
@cormacanderson Yes, I think that might be a good solution. You can still preserve the full data in a separate column. For me, the litmus test for what should go into any of the CLDF-specified fields (such as `segments`) is "are there analysis tools making use of this". So we are not attempting to describe everything that could be encountered, but narrow it down to "everything we can make sense of in an automated fashion".
@xrotwang Yes, that makes sense from a practical perspective. However, I would prefer that we narrow the gap between what is encountered (not "could be" here, as this is not hypothetical but occurs in the data and is valid) and what can be dealt with, as far as possible. There is a danger that things that cannot "be made sense of in an automated fashion" are simply ignored because we can't deal with them, even though, being rare, they are often interesting and theoretically important.
In this case, I'm not too worried about that, as there is a practical workaround. Where there isn't though, and sometimes there just isn't, my view is that BIPA should do its best to capture everything that is encountered, to stop as much as possible falling into that gap.
I have data with tone marked on (syllabic) sonorants. The data I'm looking at is from Serbo-Croat, but apparently this is possible elsewhere: (from Wikipedia) "Tone is most frequently manifested on vowels, but in most tonal languages where voiced syllabic consonants occur they will bear tone as well. This is especially common with syllabic nasals, for example in many Bantu and Kru languages, but also occurs in Serbo-Croatian." Once we put a syllabic marker on a sonorant, it puts it in the nucleus, where it's little wonder it can act like a vowel.
I don't really see any way of dealing with these cases under the current system. Tone is, naturally enough, dealt with as a vocalic feature in https://github.com/cldf-clts/clts/blob/master/pkg/transcriptionsystems/bipa/diacritics.tsv. I'm not suggesting adding it also to consonants, but the reality is that the syllabic marker means that the sonorant in question isn't really a conventional consonant any more.
Have you any suggestions, @LinguList?