Closed FredericBlum closed 1 year ago
@Lingulist You mentioned something about extracting concordances for this conversion, but I am not sure what you are referring to. Could you elaborate briefly?
I recommend reading our paper on IGT (List, Sims, Forkel 2020) for this purpose, where we mention this (Robert has developed the package further by now).
pyigt can be used to extract word/morpheme concordances, not phoneme concordances. So I don't think it's relevant for X-SAMPA to CLTS conversion.
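For readers unfamiliar with the term, a word concordance here just means the set of distinct word forms in the corpus together with their frequencies. The sketch below illustrates the idea in plain Python; it does not use pyigt's API, and the toy utterances and the `word_concordance` helper are hypothetical.

```python
from collections import Counter

# Hypothetical toy data: each utterance is a list of word forms in X-SAMPA.
utterances = [
    ["tSa", "Na"],
    ["Na", "Sita"],
    ["tSa", "Sita", "Na"],
]


def word_concordance(utterances):
    """Count how often each distinct word form occurs across all utterances."""
    counts = Counter()
    for words in utterances:
        counts.update(words)
    return counts


concordance = word_concordance(utterances)
for form, freq in concordance.most_common():
    print(form, freq)
```

The distinct forms in such a concordance are what an orthography profile would then be applied to.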
It depends on the corpus structure. I thought we would first get a concordance of words and then convert those to CLTS/BIPA with the typical orthography-profile procedure from pylexibank. Here, you would use a concordance to get those lexemes, right?
But if that is not the case, one needs to use segments directly, which changes the procedure of applying the profile.
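To make the difference between the two procedures concrete, here is a minimal sketch in plain Python. It is not the pylexibank or segments implementation: the mini-profile and both helper functions are hypothetical, and a real profile would cover the full X-SAMPA inventory. For word forms, the profile is applied by greedy longest match; for data that is already segmented, each segment is looked up directly.

```python
# Hypothetical mini-profile: X-SAMPA grapheme -> CLTS/BIPA segment.
PROFILE = {
    "tS": "tʃ",
    "t": "t",
    "S": "ʃ",
    "N": "ŋ",
    "a": "a",
    "i": "i",
}

UNKNOWN = "\ufffd"  # marker for graphemes missing from the profile


def segment_word(word, profile):
    """Segment an unsegmented word form by greedy longest match."""
    max_len = max(len(grapheme) for grapheme in profile)
    out, i = [], 0
    while i < len(word):
        # Try the longest possible chunk first, then shorter ones.
        for size in range(min(max_len, len(word) - i), 0, -1):
            chunk = word[i:i + size]
            if chunk in profile:
                out.append(profile[chunk])
                i += size
                break
        else:
            # No grapheme matched at this position: flag it and move on.
            out.append(UNKNOWN)
            i += 1
    return out


def convert_segments(segments, profile):
    """Convert data that is already segmented by direct lookup."""
    return [profile.get(segment, UNKNOWN) for segment in segments]


print(segment_word("tSa", PROFILE))             # word-level input
print(convert_segments(["tS", "a"], PROFILE))   # segment-level input
```

With word-level input, the order of matching matters (`tS` must win over `t` followed by `S`); with segment-level input, that ambiguity does not arise, which is why the procedure changes.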
Ah, ok. Yes, one could do that, although I wouldn't want to bring all the pylexibank machinery into this repo. So maybe we should
Then, copy the profiles back here and add the CLTS conversion to the makecldf command.
Yes, sounds like a plan. We have a rather complete sampa profile. Need to look that up when I find time. It may be in the orthograpy repo...
It is https://github.com/orthograpy/orthograpy, if I am not mistaken.
For the transcription, all phones are currently in X-SAMPA and need to be converted to CLTS.