itinerarium / phoneme-synthesis

A browser-based tool to convert International Phonetic Alphabet (IPA) phonetic notation to speech using the meSpeak.js package
GNU General Public License v3.0
254 stars 40 forks

Character escape sequences #14

Open bripmccann opened 5 years ago

bripmccann commented 5 years ago

The character mappings mostly represent non-ASCII characters with hexadecimal or Unicode escape sequences rather than the characters themselves. (E.g., æ is represented by \xe6 and ʌ is represented by \u028c.)
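As a quick check of the premise, the sketch below compares the two escape forms quoted above with the literal characters. The variable names are illustrative only and do not come from the project; the comparison assumes the file is saved and read as UTF-8.

```javascript
// In JavaScript, an escape sequence and the literal character it names
// produce the same string value. Variable names here are hypothetical.
const escapedAsh = '\xe6';       // hexadecimal escape for U+00E6 (æ)
const escapedTurnedV = '\u028c'; // Unicode escape for U+028C (ʌ)

// Each comparison is true when this file is decoded as UTF-8.
console.log(escapedAsh === 'æ');
console.log(escapedTurnedV === 'ʌ');
```

Because the string values are identical, any lookup keyed on these characters should behave the same whichever spelling the source uses.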

But there are also unescaped, non-ASCII IPA characters in the mappings under the "// edits arising from testing" comment. The same is true for all the instances of /mʊmˈbaɪ/ in the HTML.

Are escape sequences helpful here? If so, should these unescaped characters be converted?

If not, could all the characters be unescaped? Testing locally, it seems to work fine that way. And it would make the code more readable.

ssb22 commented 4 years ago

The escape sequences came from Lexconvert, and the reason I wrote them as escape sequences is that I was using an editor on an ASCII-only terminal at the time. I see no reason not to convert them all to UTF-8 if we can guarantee that UTF-8 will be used as the character set for the script.
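One way the conversion could be done mechanically is sketched below. The helper `unescapeNonAscii` is hypothetical and not part of the project or of meSpeak.js. It works on the source as plain text rather than parsing JavaScript, so it would also rewrite escapes inside comments or regular expressions. It leaves ASCII, control-character, and surrogate escapes alone, since turning those into literal characters could change or break the code. The result is only safe if the script is then saved and served as UTF-8, for example from a page that declares `<meta charset="utf-8">`.

```javascript
// Hypothetical helper: replace \xNN and \uNNNN escapes in JavaScript source
// text with the literal characters they name.
function unescapeNonAscii(source) {
  // The first alternative consumes an escaped backslash ("\\"), so that a
  // literal backslash followed by "xe6" is not mistaken for an escape.
  const pattern = /\\\\|\\x([0-9a-fA-F]{2})|\\u([0-9a-fA-F]{4})/g;
  return source.replace(pattern, (match, hex2, hex4) => {
    const digits = hex2 || hex4;
    if (digits === undefined) return match; // escaped backslash: keep as-is
    const code = parseInt(digits, 16);
    // Keep escapes for ASCII and control characters (below U+00A0).
    if (code < 0xa0) return match;
    // Keep surrogate escapes; a lone surrogate cannot be stored as UTF-8.
    if (code >= 0xd800 && code <= 0xdfff) return match;
    return String.fromCharCode(code);
  });
}

// Made-up sample line, not copied from the project's mappings.
const sample = "var chars = ['\\xe6', '\\u028c', '\\x27'];";
console.log(unescapeNonAscii(sample));
```

In the sample, the two IPA escapes become literal characters while `\x27` (an apostrophe) stays escaped, because unescaping it would terminate the string literal early.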