as-ideas / DeepPhonemizer

Grapheme to phoneme conversion with deep learning.
MIT License

Heteronym problem #21

Closed. EdwinYam closed this issue 2 years ago.

EdwinYam commented 2 years ago

Hello, thanks for your great work! I found that dp.phonemizer does not handle heteronyms well.

For example:

"We create the new record in the recording room"

turns into

"[W][IY] [K][R][IY][EY][T] [DH][AH] [N][UW] [R][AH][K][AO][R][D] [IH][N] [DH][AH] [R][AH][K][AO][R][D][IH][NG] [R][UW][M]"

while the noun "record" should be [R][EH][K][ER][D].

Do you have any suggestions? Thanks.

ionite34 commented 2 years ago

Essentially, the only way to pick the correct pronunciation is with part-of-speech information, which is outside the scope of the g2p model.

You can optionally train with this additional input, but in my opinion that is kind of pointless: English has only a finite, well-defined set of heteronyms, which can be replaced manually after part-of-speech tagging.

I have an example implementation that combines heteronym parsing with DeepPhonemizer: https://github.com/ionite34/Aquila-Resolve
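
To make the idea concrete, here is a minimal sketch of the "tag first, then replace known heteronyms" approach described above. It is not the Aquila-Resolve or DeepPhonemizer code. The names `HETERONYMS`, `coarse_pos`, `resolve`, and `stub_g2p` are hypothetical, the lookup table holds a single word for illustration, and the input is assumed to be already tagged with Penn Treebank-style tags by whatever part-of-speech tagger you use. The `fallback` argument stands in for the g2p model.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical, deliberately tiny table: (lowercased word, coarse POS) -> phonemes.
# A real table would need to cover the full set of English heteronyms.
HETERONYMS: Dict[Tuple[str, str], List[str]] = {
    ("record", "NOUN"): ["R", "EH", "K", "ER", "D"],
    ("record", "VERB"): ["R", "AH", "K", "AO", "R", "D"],
}


def coarse_pos(tag: str) -> str:
    """Collapse a Penn Treebank-style tag into NOUN, VERB, or OTHER."""
    if tag.startswith("NN"):
        return "NOUN"
    if tag.startswith("VB"):
        return "VERB"
    return "OTHER"


def resolve(
    tagged: List[Tuple[str, str]],
    fallback: Callable[[str], List[str]],
) -> str:
    """Phonemize (word, tag) pairs, overriding known heteronyms by POS."""
    words = []
    for word, tag in tagged:
        phonemes = HETERONYMS.get((word.lower(), coarse_pos(tag)))
        if phonemes is None:
            # Not a known heteronym for this POS: defer to the g2p model.
            phonemes = fallback(word)
        words.append("".join(f"[{p}]" for p in phonemes))
    return " ".join(words)


# Stand-in for the g2p model (assumption): a fixed lookup for two words.
STUB = {"the": ["DH", "AH"], "new": ["N", "UW"]}


def stub_g2p(word: str) -> List[str]:
    return STUB[word.lower()]


print(resolve([("the", "DT"), ("new", "JJ"), ("record", "NN")], stub_g2p))
# [DH][AH] [N][UW] [R][EH][K][ER][D]
```

In practice the `fallback` would call the phonemizer on each word that is not in the table, and the tags would come from a real tagger rather than being written by hand.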

EdwinYam commented 2 years ago

Thanks a lot for your great work! I will take a deep dive into your repo.