Source video audio is in English; after converting it to Chinese, the synthesized audio is almost unlistenable #9

hwangato commented 1 year ago

I keep getting the message "[!] Character '我' not found in the vocabulary. Discarding it." Is there no Chinese vocabulary, and is that why no speech is synthesized? Please help, thank you.

The command I ran and its log are below:

python translate.py it_cut.mp4 chinese -o it-cn.mp4

tts_models/hak/fairseq/vits is already downloaded.
Setting up Audio Processor...
 | > sample_rate:22050
 | > resample:False
 | > num_mels:80
 | > log_func:np.log10
 | > min_level_db:0
 | > frame_shift_ms:None
 | > frame_length_ms:None
 | > ref_level_db:None
 | > fft_size:1024
 | > power:None
 | > preemphasis:0.0
 | > griffin_lim_iters:None
 | > signal_norm:None
 | > symmetric_norm:None
 | > mel_fmin:0
 | > mel_fmax:None
 | > pitch_fmin:None
 | > pitch_fmax:None
 | > spec_gain:20.0
 | > stft_pad_mode:reflect
 | > max_norm:1.0
 | > clip_norm:True
 | > do_trim_silence:False
 | > trim_db:60
 | > do_sound_norm:False
 | > do_amp_to_db_linear:True
 | > do_amp_to_db_mel:True
 | > do_rms_norm:False
 | > db_level:None
 | > stats_path:None
 | > base:10
 | > hop_length:256
 | > win_length:1024
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.1.1. To apply the upgrade to your files permanently, run python -m pytorch_lightning.utilities.upgrade_checkpoint ../../.cache/torch/whisperx-vad-segmentation.bin
Model was trained with pyannote.audio 0.0.1, yours is 3.0.0. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.1.0. Bad things might happen unless you revert torch to 1.x.
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.1.1. To apply the upgrade to your files permanently, run python -m pytorch_lightning.utilities.upgrade_checkpoint ../../.cache/torch/pyannote/models--pyannote--segmentation/snapshots/c4c8ceafcbb3a7a280c2d357aee9fbc9b0be7f9b/pytorch_model.bin
Model was trained with pyannote.audio 0.0.1, yours is 3.0.0. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.1.0. Bad things might happen unless you revert torch to 1.x.
onnx load done
onnx load done
Processing: 100%|█████████████████████████████████| 2/2 [00:28<00:00, 14.28s/it]
Detected language: en (1.00) in first 30s of audio...
VideoManager is deprecated and will be removed.
base_timecode argument is deprecated and has no effect.
Face detector [scene_id: 1]: 165it [00:25, 6.52it/s]
Face detector [scene_id: 2]: 104it [00:17, 5.88it/s]
Face detector [scene_id: 3]: 126it [00:05, 24.59it/s]
Face detector [scene_id: 4]: 137it [00:24, 5.58it/s]
Face detector [scene_id: 5]: 83it [00:02, 30.47it/s]
Face detector [scene_id: 6]: 124it [00:22, 5.44it/s]
Face detector [scene_id: 7]: 105it [00:04, 24.33it/s]
Face detector [scene_id: 8]: 79it [00:16, 4.73it/s]
Text splitted to sentences.
['我是來自 FreeCodeCamp.org 的 Beau Carnes，在本課程中，我將向您展示如何使用 AI 來簡化基礎架構和網站的部署。']
我是來自 freecodecamp.org 的 beau carnes，在本課程中，我將向您展示如何使用 ai 來簡化基礎架構和網站的部署。
 [!] Character '我' not found in the vocabulary. Discarding it.
我是來自 freecodecamp.org 的 beau carnes，在本課程中，我將向您展示如何使用 ai 來簡化基礎架構和網站的部署。
 [!] Character '是' not found in the vocabulary. Discarding it.
我是來自 freecodecamp.org 的 beau carnes，在本課程中，我將向您展示如何使用 ai 來簡化基礎架構和網站的部署。
 [!] Character '來' not found in the vocabulary. Discarding it.
我是來自 freecodecamp.org 的 beau carnes，在本課程中，我將向您展示如何使用 ai 來簡化基礎架構和網站的部署。
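
The warnings above show each Chinese character being discarded because it is not in the loaded model's vocabulary, which would leave little more than the Latin-script words to synthesize. A minimal sketch of checking how much of a sentence a vocabulary covers before sending it to a TTS model follows. The helper `vocabulary_coverage` and the set `HYPOTHETICAL_VOCAB` are illustrative assumptions, not part of HeyGenClone or the TTS library; the real character set would have to be read from the model's own configuration.

```python
def vocabulary_coverage(text, vocab):
    """Return (coverage ratio, sorted missing characters), ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 1.0, []
    missing = sorted({c for c in chars if c not in vocab})
    covered = sum(1 for c in chars if c in vocab)
    return covered / len(chars), missing


# Assumption: a stand-in vocabulary of lowercase Latin letters, digits and
# basic punctuation. The actual model vocabulary may differ.
HYPOTHETICAL_VOCAB = set("abcdefghijklmnopqrstuvwxyz0123456789 .,!?'-")

# Prefix of the lowercased sentence from the log above.
sentence = "我是來自 freecodecamp.org 的 beau carnes"
ratio, missing = vocabulary_coverage(sentence, HYPOTHETICAL_VOCAB)
print(f"coverage: {ratio:.0%}, missing characters: {missing}")
```

If most of the translated text is reported as missing, the selected model is probably not suited to that script, and the synthesized audio can be expected to skip those characters.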

hwangato commented 1 year ago

The machine is a Mac with an M2 Max.

BrasD99 commented 1 year ago

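This is a real issue that I am currently working on. My friend Trueto is working on a fork of my repository, which you can refer to for now. I will also fix this bug in this repository soon!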

https://github.com/AIFSH/MyHeyGen

BrasD99 / HeyGenClone
