CTC fine-tuning is not converging. I tried changing the hyperparameters, but with no luck. During training, the model started to output empty strings.
It's a very old model; I would recommend trying Fast Conformer instead.
@titu1994 A few notebooks are still based on the QuartzNet architecture; we need to update them to use FastConformer!
Describe the bug
CTC fine-tuning is not converging. I tried changing the hyperparameters, but with no luck. During training, the model started to output empty strings.
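For context on the empty-string symptom: CTC greedy decoding merges repeated frame predictions and then drops the blank token, so a model that predicts blank on every frame decodes to an empty string. Below is a minimal sketch of that collapse in plain Python. The function names (`ctc_greedy_decode`, `blank_ratio`), the toy vocabulary, and the choice to put the blank at the last index are assumptions made for illustration; this is not NeMo's decoding code.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, vocabulary, blank_id):
    """Merge consecutive repeated frame predictions, then drop CTC blanks."""
    merged = [token_id for token_id, _ in groupby(frame_ids)]
    return "".join(vocabulary[i] for i in merged if i != blank_id)


def blank_ratio(frame_ids, blank_id):
    """Fraction of frames predicted as blank (1.0 means every frame is blank)."""
    if not frame_ids:
        return 0.0
    return sum(1 for i in frame_ids if i == blank_id) / len(frame_ids)


# Hypothetical 3-character vocabulary; the blank index is assumed to be the
# last one (len(vocabulary)) purely for this sketch.
vocabulary = ["a", "b", "c"]
blank_id = len(vocabulary)

# Per-frame argmax ids from a model that emits real tokens between blanks.
healthy_frames = [0, 0, 3, 1, 1, 3, 3, 2]
# Per-frame argmax ids from a model that emits only blanks.
all_blank_frames = [blank_id] * 8

print(repr(ctc_greedy_decode(healthy_frames, vocabulary, blank_id)))
print(repr(ctc_greedy_decode(all_blank_frames, vocabulary, blank_id)))
print(blank_ratio(all_blank_frames, blank_id))
```

Tracking something like the blank ratio on a validation batch during training is one way to confirm whether the empty outputs come from an all-blank prediction rather than from a decoding or vocabulary mismatch.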
Expected behavior
Fine-tuning was expected to converge for a new language; I was simply following the tutorial: https://github.com/NVIDIA/NeMo/blob/main/tutorials/asr/ASR_CTC_Language_Finetuning.ipynb
Environment overview
Environment details
Package Version
absl-py 2.1.0
accelerated-scan 0.2.0
addict 2.4.0
aiohttp 3.9.5
aiosignal 1.3.1
alabaster 0.7.16
alembic 1.13.1
aniso8601 9.0.1
antlr4-python3-runtime 4.9.3
anyio 4.4.0
argon2-cffi 23.1.0
argon2-cffi-bindings 21.2.0
arrow 1.3.0
asciitree 0.3.3
asteroid-filterbanks 0.4.0
asttokens 2.4.1
async-lru 2.0.4
async-timeout 4.0.3
attrdict 2.0.1
attrs 23.2.0
audioread 3.0.1
Babel 2.15.0
beautifulsoup4 4.12.3
black 24.4.2
bleach 6.1.0
boto3 1.34.113
botocore 1.34.113
braceexpand 0.1.7
causal-conv1d 1.2.2.post1
cdifflib 1.2.6
certifi 2024.2.2
cffi 1.16.0
charset-normalizer 3.3.2
click 8.0.2
clip 0.2.0
cloudpickle 3.0.0
colorama 0.4.6
colorlog 6.8.2
comm 0.2.2
contourpy 1.2.1
cycler 0.12.1
Cython 3.0.10
cytoolz 0.12.3
datasets 2.19.1
debugpy 1.8.1
decorator 5.1.1
decord 0.6.0
defusedxml 0.7.1
diart 0.9.0
diffusers 0.28.0
dill 0.3.8
Distance 0.1.3
docker-pycreds 0.4.0
docopt 0.6.2
docutils 0.21.2
dtw-python 1.4.2
editdistance 0.8.1
einops 0.8.0
einops-exts 0.0.4
exceptiongroup 1.2.0
executing 2.0.1
faiss-cpu 1.8.0
fasteners 0.19
fastjsonschema 2.19.1
fasttext 0.9.2
fiddle 0.3.0
filelock 3.14.0
Flask 2.2.5
Flask-RESTful 0.3.10
fonttools 4.51.0
fqdn 1.5.1
frozenlist 1.4.1
fsspec 2024.3.1
ftfy 6.2.0
future 1.0.0
g2p-en 2.1.0
gdown 5.2.0
gitdb 4.0.11
GitPython 3.1.43
graphviz 0.20.3
greenlet 3.0.3
grpcio 1.64.0
h11 0.14.0
h5py 3.11.0
httpcore 1.0.5
httpx 0.27.0
huggingface-hub 0.23.3
hydra-core 1.3.2
HyperPyYAML 1.2.2
idna 3.7
ijson 3.2.3
imageio 2.34.1
imagesize 1.4.1
importlib_metadata 7.1.0
inflect 7.2.1
iniconfig 2.0.0
inquirerpy 0.3.4
intervaltree 3.1.0
ipykernel 6.29.3
ipython 8.24.0
ipython-genutils 0.2.0
ipywidgets 8.1.3
isoduration 20.11.0
isort 5.13.2
itsdangerous 2.2.0
jedi 0.19.1
jieba 0.42.1
Jinja2 3.1.3
jiwer 2.5.2
jmespath 1.0.1
joblib 1.4.2
json5 0.9.25
jsonpointer 2.4
jsonschema 4.22.0
jsonschema-specifications 2023.12.1
julius 0.2.7
jupyter_client 8.6.2
jupyter_core 5.7.2
jupyter-events 0.10.0
jupyter-lsp 2.2.5
jupyter_server 2.14.0
jupyter_server_terminals 0.5.3
jupyterlab 4.2.1
jupyterlab_pygments 0.3.0
jupyterlab_server 2.27.2
jupyterlab_widgets 3.0.11
kaldi-python-io 1.2.2
kaldiio 2.18.0
kiwisolver 1.4.5
kornia 0.7.2
kornia_rs 0.1.3
latexcodec 3.0.0
lazy_loader 0.4
Levenshtein 0.22.0
lhotse 1.23.0
libcst 1.4.0
librosa 0.10.2
lightning 2.2.4
lightning-utilities 0.11.2
lilcom 1.7
llvmlite 0.42.0
loguru 0.7.2
lxml 5.2.2
Mako 1.3.3
Markdown 3.6
markdown-it-py 3.0.0
markdown2 2.4.13
MarkupSafe 2.1.5
marshmallow 3.21.2
matplotlib 3.8.4
matplotlib-inline 0.1.7
mdurl 0.1.2
mistune 3.0.2
more-itertools 10.2.0
mpmath 1.3.0
msgpack 1.0.8
multidict 6.0.5
multiprocess 0.70.16
mypy-extensions 1.0.0
nbclient 0.10.0
nbconvert 7.16.4
nbformat 5.10.4
nemo_text_processing 1.0.2
nemo_toolkit 2.0.0rc1
nerfacc 0.5.3
nest_asyncio 1.6.0
networkx 3.3
ninja 1.11.1.1
nltk 3.8.1
notebook 6.4.12
notebook_shim 0.2.4
numba 0.59.1
numcodecs 0.12.1
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.4.127
nvidia-nvtx-cu12 12.1.105
omegaconf 2.3.0
onnx 1.16.1
open-clip-torch 2.24.0
openai-whisper 20231117
OpenCC 1.1.6
optuna 3.6.1
overrides 7.7.0
packaging 24.0
pandas 2.2.2
pandocfilters 1.5.1
pangu 4.0.6.1
parameterized 0.9.0
parso 0.8.4
pathspec 0.12.1
pexpect 4.9.0
pfzy 0.3.4
pickleshare 0.7.5
pillow 10.3.0
pip 24.0
plac 1.4.3
platformdirs 4.2.1
pluggy 1.5.0
pooch 1.8.1
portalocker 2.8.2
primePy 1.3
progress 1.6
prometheus_client 0.20.0
prompt-toolkit 3.0.42
protobuf 4.25.3
psutil 5.9.8
ptyprocess 0.7.0
pure-eval 0.2.2
pyannote.audio 3.1.1
pyannote.core 5.0.0
pyannote.database 5.1.0
pyannote.metrics 3.2.1
pyannote.pipeline 3.0.1
pyarrow 16.1.0
pyarrow-hotfix 0.6
pybind11 2.12.0
pybtex 0.24.0
pybtex-docutils 1.0.3
pycparser 2.22
pydub 0.25.1
Pygments 2.17.2
pyloudnorm 0.1.1
PyMCubes 0.1.4
pynini 2.1.5
pyparsing 3.1.2
pypinyin 0.51.0
pypinyin-dict 0.8.0
PySocks 1.7.1
pytest 8.2.1
pytest-mock 3.14.0
pytest-runner 6.0.1
python-dateutil 2.9.0
python-json-logger 2.0.7
pytorch-lightning 2.2.4
pytorch-metric-learning 2.5.0
pytz 2024.1
PyYAML 6.0.1
pyzmq 26.0.3
rapidfuzz 2.13.7
referencing 0.35.1
regex 2024.4.28
requests 2.31.0
resampy 0.4.3
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
rich 13.7.1
rouge_score 0.1.2
rpds-py 0.18.1
ruamel.yaml 0.18.6
ruamel.yaml.clib 0.2.8
Rx 3.2.0
s3transfer 0.10.1
sacrebleu 2.4.2
sacremoses 0.1.1
safetensors 0.4.3
scikit-learn 1.4.2
scipy 1.13.0
semver 3.0.2
Send2Trash 1.8.3
sentence-transformers 3.0.0
sentencepiece 0.2.0
sentry-sdk 2.3.1
setproctitle 1.3.3
setuptools 69.5.1
shellingham 1.5.4
six 1.16.0
smmap 5.0.1
sniffio 1.3.1
snowballstemmer 2.2.0
sortedcontainers 2.4.0
sounddevice 0.4.6
soundfile 0.12.1
soupsieve 2.5
sox 1.5.0
soxr 0.3.7
speechbrain 1.0.0
Sphinx 7.3.7
sphinxcontrib-applehelp 1.0.8
sphinxcontrib-bibtex 2.6.2
sphinxcontrib-devhelp 1.0.6
sphinxcontrib-htmlhelp 2.0.5
sphinxcontrib-jsmath 1.0.1
sphinxcontrib-qthelp 1.0.7
sphinxcontrib-serializinghtml 1.1.10
SQLAlchemy 2.0.29
stack-data 0.6.2
sympy 1.12
tabulate 0.9.0
taming-transformers 0.0.1
tensorboard 2.16.2
tensorboard-data-server 0.7.2
tensorboardX 2.6.2.2
tensorstore 0.1.45
termcolor 2.4.0
terminado 0.18.1
text-unidecode 1.3
textdistance 4.6.2
texterrors 0.4.4
threadpoolctl 3.5.0
tiktoken 0.6.0
timm 1.0.3
tinycss2 1.3.0
tokenizers 0.19.1
tomli 2.0.1
toolz 0.12.1
torch 2.3.0
torch-audiomentations 0.11.1
torch-pitch-shift 1.2.4
torchaudio 2.3.0
torchdiffeq 0.2.3
torchmetrics 1.3.2
torchsde 0.2.6
torchvision 0.18.0
tornado 6.4
tqdm 4.66.2
traitlets 5.14.3
trampoline 0.1.2
transformers 4.40.2
trimesh 4.4.0
triton 2.3.0
typeguard 4.3.0
typer 0.12.3
types-python-dateutil 2.9.0.20240316
typing_extensions 4.11.0
tzdata 2024.1
uri-template 1.3.0
urllib3 2.2.1
wandb 0.17.0
wcwidth 0.2.13
webcolors 1.13
webdataset 0.2.86
webencodings 0.5.1
websocket-client 1.8.0
websocket-server 0.6.4
Werkzeug 3.0.3
wget 3.2
wheel 0.43.0
whisper-timestamped 1.15.4
widgetsnbextension 4.0.11
wrapt 1.16.0
xxhash 3.4.1
yarl 1.9.4
zarr 2.18.2
zipp 3.17.0
GPU: Tesla V100