skywind3000 / ECDICT

Free English to Chinese Dictionary Database
MIT License

separate pronunciations by part of speech #93

Open garfieldnate opened 2 years ago

garfieldnate commented 2 years ago

Thank you for this great project!

I have one issue with the structure of the CSV file: I would expect duplicates in the word column, because different words can share the same spelling. For example, many English words of Latin origin are spelled the same but pronounced differently depending on whether they are used as a noun or a verb. Some examples:

| spelling | noun pronunciation | verb pronunciation |
| --- | --- | --- |
| produce | 'prәudjus | prә'dju:s |
| estimate | 'estimәt | 'estimeit |
| object | 'ɒbdʒekt | әb'dʒekt |
| proceeds | 'prәusi:dz | prә'si:dz |
| protest | 'prәutest | prә'test |

Also, "micrometer" the tool is pronounced mai'krɒmitә but the unit of measurement is pronounced mai'krәumi:tә.
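
To make the request concrete, here is a minimal sketch of what one row per (word, part of speech) pair could look like and how a reader of the file might group it. The column names `word`, `pos`, and `phonetic`, the sample rows, and the helper `load_pronunciations` are assumptions made for illustration; this is not the project's current schema or code.

```python
import csv
import io
from collections import defaultdict

# Hypothetical layout: one row per (word, pos) pair, so the same spelling
# may appear more than once. Column names are assumed for illustration only.
SAMPLE_CSV = """word,pos,phonetic
protest,n,'prәutest
protest,v,prә'test
estimate,n,'estimәt
estimate,v,'estimeit
"""


def load_pronunciations(text):
    """Group rows by spelling so each word maps to {pos: phonetic}."""
    table = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(text)):
        table[row["word"]][row["pos"]] = row["phonetic"]
    return dict(table)


pronunciations = load_pronunciations(SAMPLE_CSV)
# Look up the verb pronunciation of a spelling that has two entries.
print(pronunciations["protest"]["v"])
```

With this shape, a lookup by spelling alone returns every part of speech, and a consumer that only wants one pronunciation can still pick the first entry.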

skywind3000 commented 2 years ago

It makes sense to store multiple pronunciations per word.

But sorry, this is beyond my capacity.

I created this project in my spare time. Right now I don't have enough time or resources to complete this.

garfieldnate commented 2 years ago

No worries, I'm not asking you to fix it pronto :) You can just leave this ticket open for others to see. If someone is interested in working on this, the data is available from Wiktionary.

skywind3000 commented 2 years ago

OK, fair.