unicode-org / inflection

code, data and documentation related to handling inflection problems

Support metadata with vowel and consonant or pronunciation properties #18

Open grhoten opened 4 months ago

grhoten commented 4 months ago

Several languages change their choice of articles or prepositions depending on whether the neighboring word starts or ends in a vowel or consonant. Wikidata has IPA pronunciation information for words, such as "apple", and the vowel and consonant properties can be derived from that information. Properly supporting the English indefinite article requires this information to handle all of the edge cases. For example, in English you say "an apple" and not "a apple". You can make a default guess by using a UnicodeSet to check whether the base character at the front of the word is in "[aeiou]", but that guess is wrong for edge cases, which then have to be handled as exceptions, such as "an LED light" or "a unicorn".
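To make the default-guess-plus-exceptions approach concrete, here is a minimal Python sketch. It is not code from this repository: the names `VOWEL_LETTERS`, `STARTS_WITH_VOWEL_SOUND`, and `indefinite_article` are hypothetical, a plain Python set stands in for the UnicodeSet, and the hard-coded exception table stands in for the pronunciation metadata this issue asks for.

```python
# Hypothetical sketch: guess the English indefinite article from the first
# letter, and let an exception table override that guess.

# Stand-in for the "[aeiou]" UnicodeSet mentioned in the issue.
VOWEL_LETTERS = set("aeiou")

# Assumed, hand-written exceptions keyed on the lowercase word. In practice
# this would be derived from pronunciation metadata, not hard-coded.
STARTS_WITH_VOWEL_SOUND = {
    "led": True,       # spelled with a consonant, spoken with a vowel first
    "unicorn": False,  # spelled with a vowel, spoken with a consonant first
}


def indefinite_article(word: str) -> str:
    """Return "a" or "an" for the given word."""
    key = word.lower()
    if key in STARTS_WITH_VOWEL_SOUND:
        # The exception table wins over the spelling-based guess.
        vowel_sound = STARTS_WITH_VOWEL_SOUND[key]
    else:
        # Default guess: look only at the first letter of the word.
        vowel_sound = key[:1] in VOWEL_LETTERS
    return "an" if vowel_sound else "a"


if __name__ == "__main__":
    for w in ["apple", "LED", "unicorn", "pear"]:
        print(indefinite_article(w), w)
```

The exception table only scales as far as someone maintains it, which is why deriving the vowel or consonant property from stored pronunciation data is the more robust option.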