Closed by sadia95 3 years ago.
@sadia95 you can try the pre-trained German QuartzNet model: https://ngc.nvidia.com/catalog/models/nvidia:nemo:stt_de_quartznet15x5 Note that the correct language name also has to be set here: https://github.com/NVIDIA/NeMo/blob/main/tools/ctc_segmentation/scripts/prepare_data.py#L46 Otherwise num2words will not work for German.
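To illustrate why the language name matters for num2words, here is a minimal sketch of spelling out digits in a transcript before segmentation. It is not the code from prepare_data.py: the function name `normalize_numbers` and its `converter` parameter are hypothetical, and it assumes the third-party `num2words` package is installed and accepts the `"de"` language code.

```python
import re


def normalize_numbers(text, lang="de", converter=None):
    """Replace each run of digits in `text` with its spelled-out form.

    `converter` is a hypothetical hook for testing; by default the
    third-party num2words function is used with the given language code.
    """
    if converter is None:
        # Imported lazily so the module loads even without num2words installed.
        from num2words import num2words
        converter = num2words

    # Convert every integer found in the transcript using the chosen language.
    return re.sub(r"\d+", lambda m: converter(int(m.group()), lang=lang), text)
```

With the package installed, `normalize_numbers("Es gibt 3 Dateien", lang="de")` should spell the number in German, whereas leaving the language at an English default would insert English words into a German transcript and likely hurt the alignment.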
CTC Segmentation, which splits audio files and their transcripts, is not working with German audio. I followed the tutorial on Colab. Is there another model that can be used for German?