cldf-clts / clts

Cross-Linguistic Transcription Systems
https://clts.clld.org

Final checks on Transcription Data for V.2.0 #79

LinguList closed 3 years ago

LinguList commented 3 years ago

In my opinion, this can be done after a pre-release, but it is important:

I'll go through these later.

cormacanderson commented 3 years ago

Have you already done these? If not, I can go through them now. The problem with pʲʰ is weird and also occurs in eurasian.

LinguList commented 3 years ago

If you modify them, I'd later check with the code and see that this is accepted (I hope we won't have a problem with parsing pʲʰ, but if so, I will fix it in our code base).

The best way to proceed is to edit these small things directly online on github, and then inform me.

cormacanderson commented 3 years ago

I was going through the .tsv files. Would that work too, or is it more complicated?

I've also found a few errors in https://github.com/cldf-clts/clts/blob/master/pkg/transcriptionsystems/bipa/consonants.tsv. Will I post them here or open a new issue?

LinguList commented 3 years ago

Editing them directly on github is less work for me, so that is my preferred variant. But whether you work with a whole file or with the github editor does not matter much: you can also work offline, then open the file in the github text editor, paste your version, and submit. We will have a detailed description of all lines that changed anyway, and you have some check on column numbers.

You know how to edit files in github online, right?

cormacanderson commented 3 years ago

Never done it before, no, but can try to figure it out.

Will I change consonants.tsv that way too?

LinguList commented 3 years ago

Yes, consonants.tsv as well. What you essentially do is click on the top right, where there is a button saying "edit the file". The problem is the tab stops: you need to copy-paste them. But you can a) preview the file, and b) always leave a summary message when you submit, so we can easily find this in the log that git creates for us and look at the most recent changes.
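A minimal sketch of the kind of column check that catches lost tab stops before submitting, assuming a plain tab-separated file whose first line is the header. The function name `find_bad_rows` and the sample column names are hypothetical and not taken from the CLTS code base.

```python
def find_bad_rows(tsv_text):
    """Return (line_number, column_count) for every non-empty row whose
    number of tab-separated columns differs from the header row."""
    lines = tsv_text.splitlines()
    if not lines:
        return []
    expected = len(lines[0].split("\t"))
    bad = []
    # Data rows start on line 2, right after the header.
    for number, line in enumerate(lines[1:], start=2):
        if not line.strip():
            continue  # ignore blank lines
        count = len(line.split("\t"))
        if count != expected:
            bad.append((number, count))
    return bad


# Hypothetical sample: on line 3 a tab was replaced by spaces while pasting.
SAMPLE = (
    "GRAPHEME\tNAME\tNOTE\n"
    "p\tvoiceless bilabial stop consonant\t\n"
    "b    voiced bilabial stop consonant\t\n"
)

if __name__ == "__main__":
    for number, count in find_bad_rows(SAMPLE):
        print(f"line {number}: found {count} columns")
```

Running this on the text pasted into the editor flags the rows where a tab was turned into spaces, which the preview alone does not always make obvious.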

cormacanderson commented 3 years ago

Yes, I figured it out :). Have set up two PRs and assigned you to review. Is this okay?

cormacanderson commented 3 years ago

With eurasian I think I automatically committed by accident...

tresoldi commented 3 years ago

Should we assign each task to somebody? I think I could take at least those involving saphon, panhon, and sala.

LinguList commented 3 years ago

I will do all of this now. It is better if only one person looks into this.

cormacanderson commented 3 years ago

I could also do them. After all, I did the others and it won't take long. Maybe more consistent that way. I can assign you again to review the PR, @LinguList?

LinguList commented 3 years ago

Yes, @cormacanderson, let us proceed in this way.

LinguList commented 3 years ago

I will evaluate the modifications now; I am currently writing a final application to check the data in sources/.

cormacanderson commented 3 years ago

Okay, I've been through these now.

We have the weird rejection of pʲʰ in SAPHON too. There is nothing else that can be resolved there: the aspirated vowels have disappeared from the most recent version.

While it is a legitimate sound, I don't see any way of representing t̪ʙ with the current feature set. @tresoldi have you any ideas?

tresoldi commented 3 years ago

But CLTS/BIPA is accepting pʲʰ:

pʲʰ
palatalized aspirated voiceless bilabial stop consonant

I probably did something wrong while mapping, even though I cannot guess what as the other palatalized aspirated stops were accepted.
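One way to look into a mismatch like this, offered only as a sketch and not as a diagnosis, is to compare the code points of the grapheme in the source file with those of the BIPA entry, since visually identical strings can differ in character order or encoding. The helper names `describe` and `same_after_nfd` are hypothetical, and the code uses only the Python standard library, not pyclts.

```python
import unicodedata


def describe(grapheme):
    """List the code point and Unicode name of each character in a grapheme."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in grapheme
    ]


def same_after_nfd(a, b):
    """Check whether two strings are identical after NFD normalization."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


if __name__ == "__main__":
    bipa_form = "p\u02b2\u02b0"    # pʲʰ: palatalization marker before aspiration
    source_form = "p\u02b0\u02b2"  # assumed variant with the two modifiers swapped
    for code_point, name in describe(source_form):
        print(code_point, name)
    print(same_after_nfd(bipa_form, source_form))
```

Normalization does not reorder modifier letters, so a swapped order of ʲ and ʰ would only show up in a character-by-character comparison like this one.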

As for /t̪ʙ/, the only way (unless we add a "bilabially-post-trilled" feature which makes no real sense) would be to accept it as a cluster by changing pyclts here: https://github.com/cldf-clts/pyclts/blob/5b1547ecbdcf8777cda263fe219d8914241cb8eb/src/pyclts/transcriptionsystem.py#L219

We had all the discussions about keeping clusters restricted, also in order not to over-populate the generated sounds; maybe we add an exception just for this one?

LinguList commented 3 years ago

Never mind, @tresoldi, @cormacanderson is looking into this now, so don't worry.

LinguList commented 3 years ago

For the /t̪ʙ/, we leave it unmapped then.