open-dict-data / ipa-dict

Monolingual wordlists with pronunciation information in IPA
https://open-dict-data.github.io/ipa-lookup/
MIT License

Add files via upload #13

Open TasseDeCafe opened 5 years ago

TasseDeCafe commented 5 years ago

Hi!

Sorry for not answering your other message yet. I'll answer tomorrow. I've been working on a way to generate a dictionary for Thai, and I succeeded today. :) The source is: http://thai-language.com

This dictionary claims to have around 70,000 entries, but thousands of those are example sentences, which I removed, so the final file contains only around 46,000 words. That's a good start, because Thai (just like Vietnamese) isn't inflected, so that's a lot of distinct words.

dohliam commented 5 years ago

@TasseDeCafe This is fantastic! Thanks so much. Just a few questions before merging:

  1. Is there a license for this data?
  2. I assume you generated the IPA automatically -- did you write your own script for this, or is there an existing tool that can convert Thai text to IPA?
  3. Am I correct in thinking that the diacritics over the vowels (e.g., ɯ̂) indicate tones? Would it be possible to convert these to IPA tone letters (as in the Vietnamese dictionary)?
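
To make question 3 concrete, here is a minimal sketch of what such a conversion could look like. Everything in it is an assumption rather than something taken from the uploaded file: the diacritic-to-tone mapping follows one common convention for Thai (grave = low, circumflex = falling, acute = high, caron = rising, unmarked = mid), the specific tone-letter contours are illustrative, and the function name `syllable_to_tone_letters` is hypothetical. It also assumes the input has already been split into single syllables.

```python
import unicodedata

# ASSUMPTION: mapping from combining tone diacritics to Chao tone letters.
# The contours below follow one common convention for Thai and would need
# to be checked against the actual transcription scheme of the source data.
TONE_LETTERS = {
    "\u0300": "˨˩",   # combining grave      -> low tone
    "\u0302": "˥˩",   # combining circumflex -> falling tone
    "\u0301": "˦˥",   # combining acute      -> high tone
    "\u030C": "˩˩˦",  # combining caron      -> rising tone
}

# ASSUMPTION: a syllable with no tone diacritic carries the mid tone.
MID_TONE = "˧"


def syllable_to_tone_letters(syllable: str) -> str:
    """Strip the tone diacritic from one syllable and append a tone letter."""
    # Decompose so that precomposed characters such as "â" are split into
    # a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFD", syllable)
    tone = MID_TONE
    kept = []
    for ch in decomposed:
        if ch in TONE_LETTERS:
            tone = TONE_LETTERS[ch]  # remember the tone, drop the diacritic
        else:
            kept.append(ch)
    # Recompose whatever is left and put the tone letter at the end.
    return unicodedata.normalize("NFC", "".join(kept)) + tone


if __name__ == "__main__":
    for example in ["mɯ̂", "mâ", "ma"]:
        print(example, "->", syllable_to_tone_letters(example))
```

Multi-syllable words would need to be split into syllables first, since each syllable carries its own tone; how to do that depends on how the source marks syllable boundaries.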