Should UW be included in the phoneme set?
It seems g2p.phonemes follows a general rule of excluding the 'parent' category when its variants exist. For example, AA is not included since its variants AA0, AA1, and AA2 are in the set. The same goes for AE, AH, AW, AY, etc. UW seems to be the only exception. Furthermore, in simple frequency analyses on sizable corpora (admittedly not very rigorous), UW never occurs while its variants do. Can the phoneme set safely drop UW?
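For anyone who wants to reproduce the check, here is a rough sketch of what I did. It does not import the library. `sample_phonemes` and `sample_corpus` are made-up stand-ins for g2p.phonemes and for real transcriptions, and the helper names `bare_parents` and `count_symbols` are mine. I am also assuming that a "variant" is the parent symbol followed by a single digit 0, 1, or 2.

```python
import re
from collections import Counter

# Assumption: a variant is the parent symbol plus one digit (0, 1, or 2).
VARIANT = re.compile(r"^([A-Z]+)[012]$")


def bare_parents(phonemes):
    """Return symbols listed without a digit even though digit variants exist."""
    symbols = set(phonemes)
    parents = set()
    for symbol in symbols:
        match = VARIANT.match(symbol)
        if match:
            parents.add(match.group(1))
    # A parent is suspicious only if it is itself present in the set.
    return sorted(parents & symbols)


def count_symbols(transcriptions, targets):
    """Count how often each target symbol occurs across all transcriptions."""
    counts = Counter(p for phones in transcriptions for p in phones)
    return {target: counts[target] for target in targets}


# Hypothetical data for illustration only, not the library's real phoneme set.
sample_phonemes = ["AA0", "AA1", "AA2", "B", "UW", "UW0", "UW1", "UW2"]
sample_corpus = [["B", "UW1"], ["AA1", "B", "UW0"]]

print(bare_parents(sample_phonemes))
print(count_symbols(sample_corpus, ["UW", "UW0", "UW1", "UW2"]))
```

Swapping in the actual phoneme list and a real set of transcriptions should show whether UW is the only bare parent and whether it ever appears in output.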