GeorgeS2019 commented 11 months ago

Is it possible to support the German language?

kaiidams commented 11 months ago

NeMo provides German models. Writing phonemizers/tokenizers for German should not be difficult.

GeorgeS2019 commented 11 months ago

@kaiidams I went through NeMo but could not find how German is supported. Any link would be really appreciated.

kaiidams commented 11 months ago

This page lists available models https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/tts/checkpoints.html

If you have NeMo installed, you can run

```python
from nemo.collections.tts.models.base import SpectrogramGenerator, Vocoder
from nemo.collections.asr.models import EncDecCTCModel

SpectrogramGenerator.list_available_models()
Vocoder.list_available_models()
EncDecCTCModel.list_available_models()
```

to get the lists of available models.

[PretrainedModelInfo(
    pretrained_model_name=QuartzNet15x5Base-En,
    description=QuartzNet15x5 model trained on six datasets: LibriSpeech, Mozilla Common Voice (validated clips from en_1488h_2019-12-10), WSJ, Fisher, Switchboard, and NSC Singapore English. It was trained with Apex/Amp optimization level O1 for 600 epochs. The model achieves a WER of 3.79% on LibriSpeech dev-clean, and a WER of 10.05% on dev-other. Please visit https://ngc.nvidia.com/catalog/models/nvidia:nemospeechmodels for further details.,
    location=https://api.ngc.nvidia.com/v2/models/nvidia/nemospeechmodels/versions/1.0.0a5/files/QuartzNet15x5Base-En.nemo
 ),
 PretrainedModelInfo(
    pretrained_model_name=stt_en_quartznet15x5,
    description=For details about this model, please visit https://ngc.nvidia.com/catalog/models/nvidia:nemo:stt_en_quartznet15x5,
    location=https://api.ngc.nvidia.com/v2/models/nvidia/nemo/stt_en_quartznet15x5/versions/1.0.0rc1/files/stt_en_quartznet15x5.nemo
 ),
 PretrainedModelInfo(
...
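To narrow such a list down to one language, the entries can be filtered by name. The following is only a minimal sketch: it assumes the newer `task_lang_model` naming pattern seen in `stt_en_quartznet15x5`, so older names such as `QuartzNet15x5Base-En` are skipped. `ModelInfo` and `filter_by_language` are hypothetical names used for illustration; `ModelInfo` stands in for NeMo's `PretrainedModelInfo` so the example runs without NeMo installed.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class ModelInfo:
    """Hypothetical stand-in for NeMo's PretrainedModelInfo (name field only)."""
    pretrained_model_name: str


def filter_by_language(models: Iterable, lang: str) -> List[str]:
    """Return model names whose second underscore-separated token equals `lang`."""
    names = []
    for m in models:
        parts = m.pretrained_model_name.lower().split("_")
        # Assumed pattern: <task>_<lang>_<architecture>, e.g. stt_de_quartznet15x5
        if len(parts) > 1 and parts[1] == lang:
            names.append(m.pretrained_model_name)
    return names


# Stand-in data; with NeMo installed, pass the result of
# EncDecCTCModel.list_available_models() instead.
sample = [
    ModelInfo("QuartzNet15x5Base-En"),
    ModelInfo("stt_en_quartznet15x5"),
    ModelInfo("stt_de_quartznet15x5"),
]
german = filter_by_language(sample, "de")
print(german)
```

The same function should work on the lists returned by `SpectrogramGenerator` and `Vocoder`, as long as their names follow the same pattern.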

GeorgeS2019 commented 11 months ago

Any suggestions on which pair to use, if I understand correctly?

I will read more and come back. Thank you

German

For German STT

https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_de_quartznet15x5

```cs
string modelPath = await DownloadModelAsync("stt_de_quartznet15x5");
```

Todo

- [ ] Phonemizers for German
- [ ] Tokenizers for German

Writing phonemizers/tokenizers for German should not be difficult.
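As a rough idea of what a character-level German tokenizer could look like, here is a minimal Python sketch. The vocabulary below (space, a-z, umlauts, ß, apostrophe) is an assumption made for illustration, not the label set of any NeMo German model; a real model ships its own label list, and the ids must match it. `VOCAB`, `normalize`, `encode`, and `decode` are hypothetical names.

```python
import unicodedata
from typing import Dict, List

# Hypothetical vocabulary: space, a-z, German umlauts, eszett, and apostrophe.
# A real model defines its own labels, which must be used instead.
VOCAB: List[str] = list(" abcdefghijklmnopqrstuvwxyzäöüß'")
CHAR_TO_ID: Dict[str, int] = {ch: i for i, ch in enumerate(VOCAB)}


def normalize(text: str) -> str:
    """Compose Unicode (NFC), lowercase, and drop characters outside VOCAB."""
    text = unicodedata.normalize("NFC", text).lower()
    kept = "".join(ch for ch in text if ch in CHAR_TO_ID)
    return " ".join(kept.split())  # collapse runs of spaces


def encode(text: str) -> List[int]:
    """Map normalized text to a list of vocabulary ids."""
    return [CHAR_TO_ID[ch] for ch in normalize(text)]


def decode(ids: List[int]) -> str:
    """Map vocabulary ids back to text."""
    return "".join(VOCAB[i] for i in ids)


print(normalize("Grüße,  Welt!"))
print(decode(encode("Straße")))
```

NFC normalization is included because "ü" can arrive either as one code point or as "u" plus a combining diaeresis, and only the composed form is in the vocabulary. A phonemizer would need a pronunciation source on top of this, which the sketch does not attempt.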

For German TTS

Mel-Spectrogram Generators

Vocoders

English

For English STT

https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_quartznet15x5

```cs
string modelPath = await DownloadModelAsync("stt_en_quartznet15x5");
```

For English TTS

```cs
string phonemeDict = await DownloadModelAsync("cmudict-0.7b_nv22.10");
string heteronyms = await DownloadModelAsync("heteronyms-052722");
string specGenModelPath = await DownloadModelAsync("tts_en_fastpitch");
string vocoderModelPath = await DownloadModelAsync("tts_en_hifigan");
```
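For context on why English TTS needs both a phoneme dictionary and a heteronym list, here is a minimal Python sketch of dictionary-based phonemization. It assumes the standard CMUdict line format (`WORD  PH1 PH2 ...`, variants written as `WORD(1)`, comments starting with `;;;`) and assumes that heteronyms, whose pronunciation depends on context, are left as plain letters. This is one plausible approach, not necessarily what NeMo or NeMoOnnxSharp does; `parse_cmudict` and `phonemize` are hypothetical names, and the sample lines are made up for the example.

```python
from typing import Dict, List, Set


def parse_cmudict(lines: List[str]) -> Dict[str, List[List[str]]]:
    """Parse CMUdict-style lines into word -> list of pronunciations."""
    entries: Dict[str, List[List[str]]] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):  # skip blanks and comments
            continue
        word, *phones = line.split()
        word = word.split("(")[0].lower()  # "READ(1)" is a variant of "READ"
        entries.setdefault(word, []).append(phones)
    return entries


def phonemize(text: str, entries: Dict[str, List[List[str]]],
              heteronyms: Set[str]) -> List[str]:
    """Replace known, unambiguous words with phonemes; keep the rest as letters."""
    out: List[str] = []
    for word in text.lower().split():
        prons = entries.get(word)
        if prons is None or word in heteronyms:
            out.extend(word)      # fall back to individual letters
        else:
            out.extend(prons[0])  # take the first listed pronunciation
        out.append(" ")
    return out[:-1]  # drop the trailing separator


lines = [";;; sample entries", "HELLO  HH AH0 L OW1",
         "READ  R EH1 D", "READ(1)  R IY1 D"]
entries = parse_cmudict(lines)
print(phonemize("Hello read", entries, {"read"}))
```

A German phonemizer could follow the same shape with a German pronunciation dictionary in place of CMUdict.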

kaiidams / NeMoOnnxSharp

Possible to support German language? #20
