cmu-llab / wikihan

Creative Commons Zero v1.0 Universal

Unconverted ng -> ŋ in Hakka and Hokkien in the ipa data #1

Closed: cuichenx closed this issue 2 years ago

cuichenx commented 2 years ago

Description

There are some instances of ng in Hakka and Hokkien that should be converted to ŋ in the IPA data

To reproduce

Search for the string ng in the IPA dataset -- these should all be ŋ
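A minimal sketch of that search, assuming the IPA forms have already been loaded into a list of strings. The function name `find_unconverted_ng` and the sample forms are hypothetical and not taken from the dataset:

```python
def find_unconverted_ng(ipa_forms):
    """Return (index, form) pairs whose IPA string still contains the digraph 'ng'."""
    return [(i, form) for i, form in enumerate(ipa_forms) if "ng" in form]


# Hypothetical sample forms, for illustration only (not real dataset rows).
sample = ["luŋ", "lung", "kʰiaŋ", "song"]
leftovers = find_unconverted_ng(sample)
for index, form in leftovers:
    print(index, form)
```

Any row this reports is a candidate for a missed ng -> ŋ conversion.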

kalvinchang commented 2 years ago

The issue is that we did not add "-ung" as a rhyme/final to the mapping table

kalvinchang commented 2 years ago

For example, Hakka lùng is analyzed by Epitran as l + ùn + g because "-un" as a final is in the table but not "-ung"
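One way to picture the failure is a toy greedy longest-match segmenter. This is only an illustrative sketch, not Epitran's actual implementation: the `segment` function, both tables, and the IPA values in them are hypothetical, and tone marking is dropped from the output for brevity.

```python
import unicodedata


def segment(word, table):
    """Greedy longest-match segmentation; unmatched characters pass through unchanged."""
    # Normalize so precomposed and decomposed accents compare equal.
    word = unicodedata.normalize("NFC", word)
    table = {unicodedata.normalize("NFC", k): v for k, v in table.items()}
    max_len = max(len(k) for k in table)
    out = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, then shrink.
        for n in range(min(max_len, len(word) - i), 0, -1):
            chunk = word[i:i + n]
            if chunk in table:
                out.append(table[chunk])
                i += n
                break
        else:
            # No table entry matched: the raw character leaks into the output.
            out.append(word[i])
            i += 1
    return out


# Hypothetical tables: the first lacks the "-ung" final, the second includes it.
incomplete = {"l": "l", "ùn": "un"}
complete = {"l": "l", "ùn": "un", "ùng": "uŋ"}

print(segment("lùng", incomplete))  # a stray "g" is left over after "-un" matches
print(segment("lùng", complete))    # the longer "-ung" final wins
```

With the incomplete table, the shorter final "-un" matches and the trailing "g" passes through unconverted, which is how a literal "ng" can end up in the output.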

cuichenx commented 2 years ago

Thanks for investigating!

kalvinchang commented 2 years ago

Thanks for catching this bug :)

kalvinchang commented 2 years ago

We will need to bump the Epitran version after a new release is issued