NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
Apache License 2.0

How to migrate a Quartznet 15x5 model from NeMo 0.10.1 to 1.0.0b3? #1607

Closed AlexeiSLazarev closed 3 years ago

AlexeiSLazarev commented 3 years ago

Greetings! I have four files from a Quartznet 15x5 model:

JasperDecoderForCTC-STEP-520000.pt
JasperEncoder-STEP-520000.pt
lm.binary
quartznet15x5_ru.yaml

The model was trained using NeMo 0.10.1.

I want to fine-tune this model using NeMo 1.0.0b3. Is there a script to convert a model from older versions to the new format? I compared the .yaml file formats of 0.10.1 and 1.0.0b3, and they seem to be completely different.

Environment overview: Docker container nvcr.io/nvidia/nemo:1.0.0b3

Additional context: GPU: Nvidia GTX 980ti

okuchaiev commented 3 years ago

You can try this script https://github.com/NVIDIA/NeMo/blob/main/scripts/asr_checkpoint_port.py

AlexeiSLazarev commented 3 years ago

When I run the command:

python ./asr_checkpoint_port.py --config_path='/workspace/nemo/quartznet_ru/quartznet15x5_ru.yaml' --encoder_ckpt='/workspace/nemo/quartznet_ru/JasperEncoder-STEP-520000.pt' --decoder_ckpt='/workspace/nemo/quartznet_ru/JasperDecoderForCTC-STEP-520000.pt' --output_path='/workspace/nemo/quartznet_ru/quartznet15x5_ru.nemo'

I get an error:

ModuleNotFoundError: No module named 'ruamel'

Changing "from ruamel.yaml import YAML" to "from _ruamelyaml import YAML" inside asr_checkpoint_port.py resolved this error.
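A less invasive alternative to hand-editing the import is to try several module names in order and use the first one that loads. The sketch below only illustrates that idea: the helper name first_importable is hypothetical, and the fallback name _ruamelyaml is taken from the comment above rather than verified. Installing the ruamel.yaml package in the container is probably the cleaner fix.

```python
import importlib
from types import ModuleType
from typing import Iterable, Optional


def first_importable(names: Iterable[str]) -> Optional[ModuleType]:
    """Return the first module in `names` that imports cleanly, else None."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # ModuleNotFoundError is a subclass of ImportError, so a missing
            # package simply moves us on to the next candidate.
            continue
    return None


# Real package name first, then the fallback reported in this thread.
yaml_module = first_importable(["ruamel.yaml", "_ruamelyaml"])
YAML = getattr(yaml_module, "YAML", None)
if YAML is None:
    print("No usable ruamel YAML module found; try: pip install ruamel.yaml")
```

If YAML ends up as None, the environment is missing the dependency and the porting script will not be able to read the config at all.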

But then I got another one (full trace):

root@4506e32dbac2:/workspace/nemo/scripts# python ./asr_checkpoint_port.py --config_path='/workspace/nemo/quartznet_ru/quartznet15x5_ru.yaml' --encoder_ckpt='/workspace/nemo/quartznet_ru/JasperEncoder-STEP-520000.pt' --decoder_ckpt='/workspace/nemo/quartznet_ru/JasperDecoderForCTC-STEP-520000.pt' --output_path='/workspace/nemo/quartznet_ru/quartznet15x5_ru.nemo'
[NeMo W 2021-01-07 10:00:50 experimental:28] Module nemo.collections.asr.data.audio_to_text.AudioToCharDataset is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo W 2021-01-07 10:00:50 experimental:28] Module nemo.collections.asr.data.audio_to_text.AudioToBPEDataset is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo W 2021-01-07 10:00:50 experimental:28] Module nemo.collections.asr.data.audio_to_text.AudioLabelDataset is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo W 2021-01-07 10:00:50 experimental:28] Module nemo.collections.asr.data.audio_to_text.TarredAudioToTextDataset is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo W 2021-01-07 10:00:50 experimental:28] Module nemo.collections.asr.data.audio_to_text.TarredAudioToCharDataset is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo W 2021-01-07 10:00:50 experimental:28] Module nemo.collections.asr.data.audio_to_text.TarredAudioToBPEDataset is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo W 2021-01-07 10:00:50 experimental:28] Module <class 'nemo.collections.asr.losses.ctc.CTCLoss'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo W 2021-01-07 10:00:50 experimental:28] Module <class 'nemo.collections.asr.data.audio_to_text_dali.AudioToCharDALIDataset'> is experimental, not ready for production and is not fully supported. Use at your own risk.

WARNING, path does not exist: KALDI_ROOT=/mnt/matylda5/iveselyk/Tools/kaldi-trunk
(please add 'export KALDI_ROOT=' in your $HOME/.profile)
(or run as: KALDI_ROOT= python .py)

[NeMo I 2021-01-07 10:00:50 asr_checkpoint_port:52] Creating ASR NeMo 1.0 model
Traceback (most recent call last):
  File "./asr_checkpoint_port.py", line 71, in <module>
    main(args.config_path, args.encoder_ckpt, args.decoder_ckpt, args.output_path, args.model_type)
  File "./asr_checkpoint_port.py", line 53, in main
    model = nemo_asr.models.EncDecCTCModel(cfg=DictConfig(params['model']))
  File "/opt/conda/lib/python3.6/site-packages/omegaconf/dictconfig.py", line 81, in __init__
    self._set_value(content)
  File "/opt/conda/lib/python3.6/site-packages/omegaconf/dictconfig.py", line 549, in _set_value
    raise ValidationError(msg=msg)  # pragma: no cover
omegaconf.errors.ValidationError

okuchaiev commented 3 years ago

Looks like ruamel isn't properly installed in your environment.

loganlebanoff commented 3 years ago

I'm getting the same error of "pragma: no cover omegaconf.errors.ValidationError" and "Unsupported value type : QuartzNet"

loganlebanoff commented 3 years ago

I figured out the problem -- I changed the yaml file to be like this: https://github.com/NVIDIA/NeMo/blob/main/examples/asr/conf/quartznet_15x5.yaml

I believe I was using an old version of the yaml file that includes "AudioToTextDataLayer", "AudioToMelSpectrogramPreprocessor", "JasperEncoder", etc.
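A quick way to tell which layout a config uses before running the porting script is to inspect the parsed YAML as a plain dictionary. The sketch below is an illustration under assumptions, not part of NeMo: the function name legacy_config_problems and both sample dictionaries are made up, the legacy section names are the ones quoted above, and the guess that the old file stores "model" as a plain string comes only from the "Unsupported value type : QuartzNet" message and the params['model'] lookup in the traceback.

```python
from collections.abc import Mapping
from typing import List

# Section names quoted in this thread as belonging to the old config layout.
LEGACY_SECTIONS = (
    "AudioToTextDataLayer",
    "AudioToMelSpectrogramPreprocessor",
    "JasperEncoder",
)


def legacy_config_problems(config: Mapping) -> List[str]:
    """List the reasons why `config` looks like a pre-1.0 layout."""
    problems = []
    for name in LEGACY_SECTIONS:
        if name in config:
            problems.append(f"legacy top-level section: {name}")
    # The porting script wraps params['model'] in a DictConfig, so this
    # entry has to be a mapping rather than a string such as "QuartzNet".
    model = config.get("model")
    if not isinstance(model, Mapping):
        problems.append(f"'model' is not a mapping: {model!r}")
    return problems


# Made-up stand-ins for a parsed old-style and new-style YAML file.
old_style = {"model": "QuartzNet", "JasperEncoder": {}}
new_style = {"model": {"preprocessor": {}, "encoder": {}, "decoder": {}}}

print(legacy_config_problems(old_style))
print(legacy_config_problems(new_style))
```

An empty result does not guarantee the config is valid for 1.0; it only means none of the old-layout markers mentioned in this thread were found.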