NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
Apache License 2.0

[Question] How to generate my custom .nemo file from a fine-tuned model #1265

Closed by nikhil-salodkar 4 years ago

nikhil-salodkar commented 4 years ago

Describe your question

Using the speech_to_text.py script provided in the examples, I was able to fine-tune the QuartzNet15x5 model on a new dataset. This generated new files, including checkpoints, in the default "nemo_experiments" folder. Now I want to create a .nemo file from these newly generated checkpoint files and evaluate it with the "speech_to_text_infer.py" script already provided in the examples.

I generated a .nemo tar file containing the final weight files and the new hparams.yaml file that was generated after training. But when I try to run

EncDecCTCModel.restore_from(restore_path='re-trained-quartznet.nemo', override_config_path='/workspace/nemo/custom_asr/notebooks/nemo_experiments/QuartzNet15x5/2020-10-01_07-53-04/hparams.yaml')

where hparams.yaml is the newly generated YAML file, I get a runtime error while loading the weights:

RuntimeError: Error(s) in loading state_dict for EncDecCTCModel:

So the question is: which model_config.yaml file needs to be used to create a custom .nemo file, and can the code in speech_to_text_infer.py handle .nemo files for models fine-tuned with the speech_to_text.py script?

Environment overview (please complete the following information)

Other things, such as evaluation using a pretrained model and training a model from scratch, are working fine.

nikhil-salodkar commented 4 years ago

Figured it out myself. The info is given in the very first tutorial Jupyter notebook of NeMo.
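For anyone landing here later, a minimal sketch of one way to do this, as far as I understand it: load the fine-tuned checkpoint and let NeMo write the archive itself with save_to, instead of assembling the tar by hand. The helper names (newest_checkpoint, export_nemo) and the folder layout are assumptions made up for illustration, not something from the tutorial; load_from_checkpoint comes from PyTorch Lightning, and save_to is the NeMo model method.

```python
from pathlib import Path


def newest_checkpoint(exp_dir):
    """Return the most recently modified .ckpt file under exp_dir (searched recursively)."""
    ckpts = sorted(Path(exp_dir).rglob("*.ckpt"), key=lambda p: p.stat().st_mtime)
    if not ckpts:
        raise FileNotFoundError(f"no .ckpt files found under {exp_dir}")
    return ckpts[-1]


def export_nemo(exp_dir, out_path):
    """Load the newest checkpoint under exp_dir and save it as a .nemo archive."""
    # Imported lazily so newest_checkpoint can be used without NeMo installed.
    from nemo.collections.asr.models import EncDecCTCModel

    ckpt = newest_checkpoint(exp_dir)
    # Assumption: the checkpoint was written by the Lightning trainer that
    # speech_to_text.py sets up, so it carries the model config with it.
    model = EncDecCTCModel.load_from_checkpoint(str(ckpt))
    # save_to packs the model config and weights into a single archive,
    # so no hand-built tar or override_config_path should be needed.
    model.save_to(out_path)
    return out_path


# Hypothetical usage (paths are placeholders):
# export_nemo("nemo_experiments/QuartzNet15x5", "re-trained-quartznet.nemo")
# model = EncDecCTCModel.restore_from(restore_path="re-trained-quartznet.nemo")
```

If the fine-tuned model is still in memory at the end of training, calling save_to on it directly should give the same result without going through a checkpoint at all.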

Ashbajawed commented 3 years ago

@nikhil-salodkar can you briefly explain how you converted the .pt file into a .nemo file? Also, which one should I use? I'm getting three .pt files: encoder, decoder, and trainer.