Closed (m4b closed this issue 6 months ago)
Thanks for bringing this issue to my attention! The IPA transcription for English does need some refinement. I'll be releasing version 1.9.0 of the program soon, which will include these improvements along with an additional pronunciation variant (Dallas, TX).
Update... I have been exceptionally busy recently, but I am now back to working on the program's development.
I should note that comparing the results of this with, e.g., Python's phonemize shows quite a lot of differences (English is all I tested)
This is correct. Firstly, the transcriptions generated by lngcnv are phonetic rather than phonemic, unlike what you typically find in dictionaries. Secondly, the tool is designed to generate IPA for full sentences, not just single words. For instance, the word "get" in "get out" may be transcribed differently from "get" in "get back," just as it is pronounced differently in natural speech. Thirdly, I aim to represent precisely defined dialects, such as those spoken in specific real-life cities (based solely on recordings and the phonetics literature), rather than conforming to theoretical "standards."

However, I am developing all of this in my spare time, and English is just one of several languages included... This tool will be quite different from anything I've encountered before, for any language, not just English. Especially for English, it will honestly take several years to make this rock-solid. Once again, I apologize for the delay. Currently, I am working on adding an alpha version of the pronunciation from Dallas, TX, and correcting errors in the pronunciation from Canberra, Australia.
context: i am an amateur linguist at best
phonetic vs phonemic, til there is a distinction which I will need to understand :)
I'm glad you clarified its intended uses. I have primarily been using it for single words, and for my purposes the results of phonemize were more amenable, with lngcnv being perhaps a bit too precise/sophisticated :) Most other IPA tooling seems to ignore even the basic syllabic pauses/suprasegmentals and other prosody features, let alone some of the more subtle IPA sounds that your tool sometimes produces :)
Lastly, please don't feel the need to apologize for any delays, or even if you never fix anything; this is great stuff regardless, and I was very pleased to find some Rust tooling in this (oft-neglected) space :)
Phonetic representation of sounds is much more detailed than phonemic representation. For example, the word "duck" can be pronounced [dɐ̠k] in Canberra and [dɜ̹k] in Dallas, but in dictionaries for both variants of English, it will be phonemically written as /dʌk/. For a non-native speaker of English, the phonemic version is sufficient to be understood, but one will sound like a foreigner forever. The phonemic version is all you get in dictionaries of English and in most/all available tools. On the other hand, with the phonetic representation, one can theoretically learn to pronounce like a native speaker from a given locality.
However, this requires spending a lot of time on recordings and literature, and all this is challenging to code given the irregularities of English orthography.
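To illustrate the distinction, here is a minimal Python sketch of one phonemic form mapping to several dialect-specific phonetic forms. The dictionaries and the `transcribe` helper are hypothetical and hold only the "duck" example from above; this is not how lngcnv stores or derives its data.

```python
# Hypothetical toy data: only the "duck" example from this thread.
PHONEMIC = {"duck": "d\u028ck"}  # /dʌk/, the dictionary-style form

PHONETIC = {
    "duck": {
        "Canberra": "d\u0250\u0320k",  # ɐ plus the "retracted" diacritic
        "Dallas": "d\u025c\u0339k",    # ɜ plus the "more rounded" diacritic
    },
}


def transcribe(word, dialect=None):
    """Return the /phonemic/ form, or the [phonetic] form for a dialect."""
    if dialect is None:
        return "/" + PHONEMIC[word] + "/"
    return "[" + PHONETIC[word][dialect] + "]"


print(transcribe("duck"))              # one phonemic form for all dialects
print(transcribe("duck", "Canberra"))  # dialect-specific phonetic form
print(transcribe("duck", "Dallas"))
```

The point of the structure is that the phonemic lookup needs no dialect at all, while the phonetic lookup cannot be answered without one.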
I'm delighted to hear that you find lngcnv useful! :) However, depending on your use case, Python's 'phonemize' or some other tool may be more suitable than lngcnv. This tool is designed to be more sophisticated than those tools typically used by language learners, for example.
One more thing: There's no universal or singularly correct way to use IPA. For example, [o̞] and [ɔ̝] may both represent the exact same vowel because lowering [o] and raising [ɔ] may both result in the same position of the vowel. Differences in the outputs between lngcnv and some other tools do not necessarily imply differences in pronunciation.
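As a rough illustration of why a plain string comparison between tools can mislead, the Python sketch below breaks the two symbols into their Unicode code points using the standard `unicodedata` module. The `describe` helper is a name made up for this example; it only inspects characters and says nothing about how lngcnv or any other tool works internally.

```python
import unicodedata


def describe(ipa):
    """Return the Unicode name of every code point in an IPA string."""
    decomposed = unicodedata.normalize("NFD", ipa)
    return [unicodedata.name(ch) for ch in decomposed]


lowered_o = "o\u031e"            # o with the "lowered" diacritic
raised_open_o = "\u0254\u031d"   # ɔ with the "raised" diacritic

print(describe(lowered_o))
print(describe(raised_open_o))

# The strings differ character by character, even though both
# notations may be used for the same vowel quality.
print(lowered_o == raised_open_o)
```

A naive diff of two tools' outputs would flag these as a mismatch, so comparing transcriptions across tools needs some phonetic judgment rather than byte equality alone.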
A list of the aspects of the program that aren't very clear would help me improve the documentation. If you find anything that isn't clear, feel free to open an issue... ;)
I've created a new branch called 'dev'. I don't want to make changes directly to 'main' because Canonical's Snap Store would automatically release them. https://github.com/piotrbajdek/lngcnv/blob/dev/docs/images/help-image.png This version is unstable (it can change rapidly) and unfinished (as of today, around half of the Texan English sounds remain unconverted and identical to Australian English). However, from now on, users will have access to the newest features if they compile the code themselves. 😊
If the problem with 'cellar door' and similar words is resolved in 'dev', this issue can be closed. 😁
The output of:
is:
But the [k] in IPA is (I believe) a voiceless velar plosive (https://en.wikipedia.org/wiki/Voiceless_velar_plosive), which is a "k" sound, like in "cut"; this should be like an "s" sound, like "snake". Similar words like "celebrate" or "cyborg" also have this issue.

Very neat tool!
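The pattern behind this report is the common English spelling heuristic that "c" is usually read as "s" before "e", "i", or "y", and as "k" otherwise. A minimal Python sketch of that heuristic follows; the `c_sound` function is hypothetical, ignores digraphs such as "ch" and exceptions such as "Celtic", and is not lngcnv's actual conversion logic.

```python
# Letters that usually make a preceding "c" soft (an assumption of this toy rule).
SOFT_C_TRIGGERS = {"e", "i", "y"}


def c_sound(word, index=0):
    """Guess whether the letter 'c' at `index` is read as 's' or 'k'."""
    word = word.lower()
    if word[index] != "c":
        raise ValueError("no letter 'c' at position %d in %r" % (index, word))
    following = word[index + 1:index + 2]  # empty string at the end of a word
    return "s" if following in SOFT_C_TRIGGERS else "k"


for example in ("cut", "cellar", "celebrate", "cyborg"):
    print(example, c_sound(example))
```

With this rule, "cut" keeps the "k" sound while "cellar", "celebrate", and "cyborg" get the "s" sound expected in the report.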