Closed 6drf21e closed 3 months ago
P.S. I suggest noting in the code where the loaded JSON comes from.
I'm curious whether this hack has been verified not to affect output quality in any way. By swapping words, you are scrambling phrases into meaningless individual characters. In principle, this should affect the linguistic ability of the model.
> In principle, this should affect the linguistic ability of the model.
A homophone is better than no sound at all when the model cannot recognize a character. We will remove (or shrink) this replacement map once the model is updated and can recognize the new characters.
The replacements only target characters that the model outputs incorrectly. In modern novels, the replacement rate is low. For example, in "Twenty Thousand Leagues Under the Sea":
Original | Replacement | Count |
---|---|---|
鲛 | 教 | 43 |
鳃 | 塞 | 32 |
桅 | 维 | 27 |
舷 | 闲 | 22 |
颚 | 恶 | 16 |
呷 | 嘎 | 12 |
锨 | 先 | 12 |
獭 | 塔 | 8 |
岖 | 区 | 7 |
囱 | 聪 | 6 |
In classical novels, the rate is higher. For example, in "Romance of the Three Kingdoms":
Original | Replacement | Count |
---|---|---|
郃 | 和 | 249 |
岱 | 带 | 189 |
惇 | 蹲 | 161 |
褚 | 楚 | 154 |
傕 | 觉 | 136 |
汜 | 四 | 134 |
讫 | 气 | 133 |
赍 | 机 | 119 |
綝 | 陈 | 109 |
瑁 | 帽 | 103 |
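Counts like the ones in these tables could be gathered with a short script. The following is only a minimal sketch, not the code used for this PR: the function name `count_replacements` is hypothetical, and it assumes the map is a flat dictionary from one character to its homophone.

```python
from collections import Counter


def count_replacements(text, mapping):
    """Count how often each mapped character occurs in text.

    `mapping` is assumed to be a flat {original_char: replacement_char} dict.
    Returns (original, replacement, count) tuples, most frequent first.
    """
    counts = Counter(ch for ch in text if ch in mapping)
    return [(ch, mapping[ch], n) for ch, n in counts.most_common()]


# Two entries taken from the table above; the sample sentence is made up.
sample_map = {"郃": "和", "惇": "蹲"}
rows = count_replacements("张郃与夏侯惇同行，张郃先至。", sample_map)
for original, replacement, count in rows:
    print(f"{original} | {replacement} | {count} |")
```

Dividing the total of these counts by the length of the text would give a rough replacement rate for a book.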
This PR works around mispronounced and skipped characters in the TTS system with a hack. The main idea is as follows:
For example:
The replacement rule file ChatTTS/homophones_map.json contains 16,000 entries.
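To make the idea concrete, here is a rough sketch of how such a rule file could be applied before text is sent to the model. It is an illustration under assumptions, not the PR's actual code: the helper names `load_homophones` and `replace_homophones` are hypothetical, and the JSON is assumed to be a flat object mapping each unsupported character to a single homophone.

```python
import json


def load_homophones(path):
    """Load the rule file; assumed to be a flat JSON object {char: homophone}."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def replace_homophones(text, mapping):
    """Swap every character that has a rule; leave all other characters as-is."""
    return "".join(mapping.get(ch, ch) for ch in text)


# In-memory stand-in for the rule file, using entries quoted in this thread.
sample_map = {"鲛": "教", "鳃": "塞", "桅": "维"}
converted = replace_homophones("桅杆", sample_map)
print(converted)
```

Because the lookup is per character, characters without a rule pass through unchanged, which matches the point made in the discussion that only characters the model cannot handle are touched.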
Rule creation process:
Corpus used: Tencent AI Lab Embedding Corpora for Chinese and English Words and Phrases
Limitations: