piotrbajdek / lngcnv

linguistics: display pronunciation, translate between dialects, convert between orthographies; support for multiple languages: English, Latin, Polish, Quechua, Spanish, Tikuna
https://crates.io/crates/lngcnv
MIT License
18 stars, 1 fork

cellar door has a velar plosive #4

Closed: m4b closed this issue 6 months ago

m4b commented 9 months ago

The output of:

--ipa --eng 'cellar door'

is:

ke̽lə̞ dɔ̈ː

But the k in IPA is (I believe) a voiceless velar plosive (https://en.wikipedia.org/wiki/Voiceless_velar_plosive), i.e. a "k" sound, as in "cut"; here it should be an "s" sound, as in "snake". Similar words such as "celebrate" or "cyborg" have the same issue.

Very neat tool!
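The report above comes down to the common English spelling rule that "c" is usually soft (an "s" sound) before e, i, or y and hard (a "k" sound) elsewhere. A minimal sketch of that rule follows. The function name `c_sound` is hypothetical and is not part of lngcnv, and the rule deliberately ignores digraphs such as "ch" and "ck" and other exceptions.

```rust
/// Hypothetical helper, not taken from lngcnv.
/// Returns the broad sound of the letter "c" found at character index `i`
/// of `word`, or `None` if that position does not hold a "c".
/// Assumption: the simple rule "soft before e, i, y; hard otherwise".
fn c_sound(word: &str, i: usize) -> Option<char> {
    // Work on characters, not bytes, and ignore case.
    let chars: Vec<char> = word.to_lowercase().chars().collect();
    if chars.get(i).copied() != Some('c') {
        return None; // no "c" at this position
    }
    match chars.get(i + 1).copied() {
        // Soft "c": cellar, celebrate, cyborg
        Some('e') | Some('i') | Some('y') => Some('s'),
        // Hard "c": cut, and also a word-final "c"
        _ => Some('k'),
    }
}
```

With this sketch, `c_sound("cellar", 0)` returns `Some('s')`, while `c_sound("cut", 0)` returns `Some('k')`.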

piotrbajdek commented 9 months ago

Thanks for bringing this issue to my attention! The IPA transcription for English does need some refinement. I'll be releasing version 1.9.0 of the program soon, which will include these improvements along with an additional pronunciation variant (Dallas, TX).

piotrbajdek commented 7 months ago

Update... I have been exceptionally busy recently, but I am now back to working on the program's development.

m4b commented 7 months ago

I should note that comparing the results of this with, e.g., Python's phonemize shows quite a lot of differences (English is all I tested).

piotrbajdek commented 7 months ago

This is correct. Firstly, the transcriptions generated by lngcnv are phonetic rather than phonemic, unlike what you typically find in dictionaries. Secondly, the tool is designed to generate IPA for full sentences, not just single words. For instance, the word "get" in "get out" may be transcribed differently from "get back," just as it occurs in natural speech. Thirdly, I aim to represent precisely defined dialects, such as those spoken in specific cities in real life (based solely on recordings and the phonetics literature), rather than conforming to theoretical "standards."

That said, I am developing all of this in my spare time, and English is just one of several languages included. This tool will be quite different from anything I've encountered before, for any language, not just English. Especially for English, it will honestly take several years to make this rock-solid.

Once again, I apologize for the delay. Currently, I am working on adding an alpha version of the pronunciation from Dallas, TX, and correcting errors in the pronunciation from Canberra, Australia.
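The "get out" versus "get back" point can be illustrated with a small context-dependent rule. The sketch below is only an assumption about what such a rule might look like: it uses the widely described flapping of word-final /t/ before a vowel-initial word, tests orthographic vowels for simplicity, and is not lngcnv's actual logic. The name `final_t` is hypothetical.

```rust
/// Hypothetical sketch, not lngcnv's implementation.
/// Chooses a realization for a word-final "t" by looking at the next word:
/// a flap [ɾ] before a word that starts with a vowel letter, plain [t] otherwise.
/// Assumption: spelling is a good enough stand-in for the following sound.
fn final_t(next_word: Option<&str>) -> &'static str {
    let starts_with_vowel = next_word
        .and_then(|w| w.chars().next())
        .map(|c| "aeiou".contains(c.to_ascii_lowercase()))
        .unwrap_or(false); // no following word: treat as non-vowel context
    if starts_with_vowel {
        "ɾ"
    } else {
        "t"
    }
}
```

Here `final_t(Some("out"))` picks the flap and `final_t(Some("back"))` keeps the plain stop, so the same word "get" ends differently depending on its neighbor, which a single-word transcriber cannot express.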

m4b commented 7 months ago

context: i am an amateur linguist at best

Phonetic vs. phonemic: TIL there is a distinction, which I will need to understand :)

I'm glad you clarified its intended uses. I have primarily been using it for single words, and for my purposes the results of phonemize were more amenable, with lngcnv being perhaps a bit too precise/sophisticated :) Most other IPA tooling seems to ignore even the basic syllabic pauses, suprasegmentals, and other prosodic features, let alone some of the more subtle IPA sounds that your tool sometimes produces :)

Lastly, please don't feel the need to apologize for any delays; even if you never fix anything, this is great stuff regardless, and I was very pleased to find some Rust tooling in this (oft-neglected) space :)

piotrbajdek commented 7 months ago

Phonetic representation of sounds is much more detailed than phonemic representation. For example, the word "duck" can be pronounced [dɐ̠k] in Canberra and [dɜ̹k] in Dallas, but in dictionaries for both variants of English, it will be phonemically written as /dʌk/. For a non-native speaker of English, the phonemic version is sufficient to be understood, but one will sound like a foreigner forever. This is all you get in dictionaries of English and in most, if not all, available tools. On the other hand, with the phonetic representation, one can theoretically learn to pronounce like a native speaker from a given locality.

However, this requires spending a lot of time on recordings and literature, and all this is challenging to code given the irregularities of English orthography.
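The "duck" example can be read as a mapping from one phoneme to a dialect-specific phone. The sketch below shows that idea with the two realizations quoted above; the type `Dialect` and the functions `strut_vowel` and `realize` are hypothetical names chosen for illustration and say nothing about how lngcnv is structured internally.

```rust
/// Hypothetical dialect selector; the two variants are the ones named in the thread.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Dialect {
    Canberra,
    Dallas,
}

/// Phonetic realization of the phoneme /ʌ/ for each dialect,
/// using the values quoted in the comment above.
fn strut_vowel(d: Dialect) -> &'static str {
    match d {
        Dialect::Canberra => "ɐ̠",
        Dialect::Dallas => "ɜ̹",
    }
}

/// Turns a phonemic string such as "dʌk" into a narrower transcription
/// by replacing /ʌ/ with its dialect-specific realization.
/// Assumption: only this one vowel is handled; everything else is copied as-is.
fn realize(phonemic: &str, d: Dialect) -> String {
    phonemic.replace('ʌ', strut_vowel(d))
}
```

Calling `realize("dʌk", Dialect::Canberra)` and `realize("dʌk", Dialect::Dallas)` yields two different strings from the same phonemic input, which is the distinction a dictionary entry does not capture.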

piotrbajdek commented 7 months ago

I'm delighted to hear that you find lngcnv useful! :) However, depending on your use case, Python's 'phonemize' or some other tool may be more suitable than lngcnv. This tool is designed to be more sophisticated than those tools typically used by language learners, for example.

piotrbajdek commented 7 months ago

One more thing: There's no universal or singularly correct way to use IPA. For example, [o̞] and [ɔ̝] may both represent the exact same vowel because lowering [o] and raising [ɔ] may both result in the same position of the vowel. Differences in the outputs between lngcnv and some other tools do not necessarily imply differences in pronunciation.
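The point about [o̞] and [ɔ̝] can be made concrete with a toy model of vowel height. The sketch below assumes a seven-step integer scale (0 = open, 6 = close) and treats each raising or lowering diacritic as exactly one step; both are simplifying assumptions, and the function names are hypothetical. Only the two base vowels from the example are modelled.

```rust
/// Hypothetical height scale: 0 = open, 2 = open-mid, 3 = mid,
/// 4 = close-mid, 6 = close. Only "o" and "ɔ" are modelled here.
fn base_height(v: char) -> Option<i32> {
    match v {
        'o' => Some(4), // close-mid back rounded vowel
        'ɔ' => Some(2), // open-mid back rounded vowel
        _ => None,
    }
}

/// Computes the modelled height of a base vowel followed by zero or more
/// raising (U+031D) or lowering (U+031E) combining diacritics.
/// Assumption: each diacritic shifts the height by exactly one step.
fn height(symbol: &str) -> Option<i32> {
    let mut chars = symbol.chars();
    let mut h = base_height(chars.next()?)?;
    for c in chars {
        match c {
            '\u{031D}' => h += 1, // combining up tack below: raised
            '\u{031E}' => h -= 1, // combining down tack below: lowered
            _ => return None,     // anything else is outside this toy model
        }
    }
    Some(h)
}
```

Under these assumptions, lowered "o" and raised "ɔ" both land on the same height value, so two tools can print different symbols for what is modelled as the same vowel.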

piotrbajdek commented 7 months ago

A list of the aspects of the program that aren't very clear would help me improve the documentation. If you find anything unclear, feel free to open an issue... ;)

piotrbajdek commented 7 months ago

I've created a new branch called 'dev'. I don't want to make changes directly to 'main' because Canonical's Snap Store would automatically release them.

https://github.com/piotrbajdek/lngcnv/blob/dev/docs/images/help-image.png

This version is unstable (it can change rapidly) and unfinished (as of today, around half of the Texan English sounds remain unconverted and identical to Australian English). However, from now on, users will have access to the newest features if they compile the code themselves. 😊

If the problem with 'cellar door' and similar words is resolved in 'dev', this issue can be closed. 😁