cldf-clts / clts

Cross-Linguistic Transcription Systems
https://clts.clld.org

Poor readability of diacritics in some combinations #74

cormacanderson closed this issue 3 years ago

cormacanderson commented 3 years ago

The following came up in the latest PHOIBLE mapping.

1) Release diacritic: Raw grapheme p͉ʲ mapping was replaced from p̚ʲ to pʲ̚. Raw grapheme t͉ʲ mapping was replaced from t̚ʲ to tʲ̚. Raw grapheme k͉ʷ mapping was replaced from k̚ʷ to kʷ̚.

I think perhaps the release diacritic underneath might be most readable here, but for consistency would prefer the second option, e.g. t̚ʲ.
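Since the candidate forms differ only in the order of the same characters, it can help to look at them code point by code point. The following is a minimal sketch using only the Python standard library; the helper name `describe` is hypothetical, and the code points are my reading of the graphemes quoted above (U+031A for the release diacritic, U+02B2 for the palatalization mark), so treat them as an assumption.

```python
import unicodedata


def describe(grapheme):
    """Return (code point, Unicode name) pairs for each character in a grapheme."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in grapheme
    ]


# Assumed encodings of the two candidate mappings for the raw grapheme.
before = "t\u031a\u02b2"  # release mark directly on the stop, then palatalization
after = "t\u02b2\u031a"   # palatalization first, release mark on the modifier letter

for label, grapheme in (("before", before), ("after", after)):
    print(label, grapheme, describe(grapheme))
```

Both strings contain the same three characters, so the readability difference comes purely from which character the combining mark attaches to.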

2) Diacritics on length marks: Raw grapheme ä̠ː mapping was replaced from to aː̈. Raw grapheme äː mapping was replaced from äː to aː̈. Raw grapheme a̟ː mapping was replaced from a̟ː to aː̟.

I do not think this is preferable and would rather have ä, äː, a̟ː here. There are other instances of this elsewhere too.

3) devoicing diacritics: Raw grapheme d̥ʒ̥ mapping was replaced from dʒ̊ to dʒ̥. Raw grapheme d̥ʒ̊ mapping was replaced from dʒ̊ to dʒ̥.

I would be inclined here to retain the devoicing diacritic also on the stop component. Even though we don't do this with other cases of diacritics, I view this one as somewhat different.

LinguList commented 3 years ago

@cormacanderson, note that diacritics are tricky, because their placement is defined by the "write_order" we discussed. When they are problematic, it is for one of two reasons:

  1. the write order should be changed (!), as I assume is the case for long ä (!). This relates to our pyclts code, and issues should be filed there.
  2. the diacritic has two distinct placements in Unicode, one above and one below, but only one can be taken as the standard. If the sound is then not defined as a base sound in CLTS, the algorithm will automatically compose the sound, as is the case for dʒ̥.

The preferred way to handle the cases in 2 is to add the sounds explicitly to consonants.tsv (or vowels.tsv), since an explicit entry there has the final say and overrides any problematic behavior, regardless of the composition. So if you add dʒ̊ to consonants.tsv, that form is the one that will be accepted.
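The precedence described here (an explicit table entry wins, otherwise the sound is composed with the standard diacritic) can be sketched roughly as below. This is not the pyclts implementation: the function `render`, the dictionary standing in for consonants.tsv, and the sound name used as a key are all hypothetical and only illustrate the lookup order.

```python
import unicodedata

RING_BELOW = "\u0325"  # COMBINING RING BELOW, assumed to be the standard placement
RING_ABOVE = "\u030a"  # COMBINING RING ABOVE, the alternative placement


def render(name, base, explicit_sounds):
    """Return the explicit grapheme if one is listed, else compose one.

    `explicit_sounds` is a hypothetical stand-in for rows in consonants.tsv.
    """
    if name in explicit_sounds:
        return explicit_sounds[name]
    # Fallback: automatic composition with the standard diacritic.
    return base + RING_BELOW


name = "devoiced affricate (hypothetical key)"

composed = render(name, "d\u0292", {})
overridden = render(name, "d\u0292", {name: "d\u0292" + RING_ABOVE})

print(composed, [unicodedata.name(c) for c in composed])
print(overridden, [unicodedata.name(c) for c in overridden])
```

Without an explicit entry the ring ends up below; once the sound is listed explicitly, the listed form is returned unchanged.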

cormacanderson commented 3 years ago

For the consonants this may be resolved then by https://github.com/cldf-clts/pyclts/issues/31.

I suppose then we need a new write order on vowels:

    _write_order = dict(
        pre=[],
        post=[
            'tongue_root', 'raising', 'centrality', 'rounding',
            'voicing', 'breathiness', 'creakiness',
            'syllabicity', 'frication', 'relative_articulation',
            'nasalization', 'tone', 'articulation', 'rhotacization',
            'pharyngealization', 'glottalization', 'velarization',
            'duration'])
    _name_order = [
        'duration', 'rhotacization', 'pharyngealization',
        'glottalization', 'velarization', 'syllabicity',
        'relative_articulation',
        'tongue_root', 'raising', 'rounding',
        'articulation', 'nasalization', 'voicing', 'creakiness',
        'breathiness', 'roundedness', 'height', 'frication', 'centrality',
        'tone']
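To illustrate why the position of 'duration' in the write order matters, here is a minimal sketch of how such an order could drive composition. It is an assumption-laden toy, not the pyclts code: `compose` and the `DIACRITICS` table are hypothetical, and the table covers only the two features needed for the long ä case discussed above.

```python
# Hypothetical feature-to-character table (a tiny subset, for illustration only).
DIACRITICS = {
    ("centrality", "centralized"): "\u0308",  # combining diaeresis
    ("duration", "long"): "\u02d0",           # length mark
}


def compose(base, features, post_order):
    """Append the character for each feature in the order given by post_order."""
    out = base
    for feature in post_order:
        value = features.get(feature)
        if value is not None:
            out += DIACRITICS[(feature, value)]
    return out


features = {"centrality": "centralized", "duration": "long"}

# 'duration' written last: the diaeresis attaches to the vowel.
duration_last = compose("a", features, ["centrality", "duration"])
# 'duration' written first: the diaeresis lands on the length mark.
duration_first = compose("a", features, ["duration", "centrality"])

print(duration_last, duration_first)
```

With 'duration' at the end of the post list, as in the proposed order, combining marks are attached to the vowel before the length mark is written.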

I will open an issue then at https://github.com/cldf-clts/pyclts.

For the devoicing diacritics, it is maybe best to leave things as they are for now. I will close this issue.