NVIDIA / DeepLearningExamples

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

Can we fine-tune a fine-tuned model for the QuartzNet architecture? #1089

Open wiamfa opened 2 years ago

wiamfa commented 2 years ago

Hello, are there any pre-trained models for the French language? Or is it possible to fine-tune a fine-tuned model a second time?

Thanks,

alancucki commented 2 years ago

Hi @wiamfa,

There is a pre-trained model for French from the NeMo project.

You can use it in NeMo, or follow these steps to load it in DeepLearningExamples:

  1. Change the extension from .nemo to .tar.gz.
  2. After unpacking you'll find two files: model_weights.ckpt and model_config.yaml.
  3. Make a copy of configs/quartznet15x5_speedp-online-1.15_speca.yaml and replace the labels with those from model_config.yaml.
  4. Rename the keys in model_weights.ckpt:

    import torch

    # Map NeMo key names onto the DeepLearningExamples naming scheme
    remap = lambda k: (k.replace('encoder.encoder', 'encoder.layers')
                        .replace('decoder.decoder_layers', 'decoder.layers')
                        .replace('conv.weight', 'weight'))

    ckpt = torch.load('model_weights.ckpt')
    # Drop preprocessing buffers and save in the expected {'state_dict': ...} layout
    torch.save({'state_dict': {remap(k): v for k, v in ckpt.items() if 'preproc' not in k}},
               'qn_15x5_french.pt')

    Voilà!
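If you want to sanity-check the renaming before loading the checkpoint, you can run the same remapping rules over a toy key list. The key names below are illustrative stand-ins, not the actual entries of the NeMo checkpoint:

    # Hypothetical sample of NeMo-style keys; a real checkpoint has many more.
    nemo_keys = [
        'preprocessor.featurizer.window',         # preprocessing buffer, dropped
        'encoder.encoder.0.mconv.0.conv.weight',  # encoder conv weight
        'decoder.decoder_layers.0.weight',        # decoder weight
    ]

    def remap(k):
        # Same renaming rules as in the snippet above.
        return (k.replace('encoder.encoder', 'encoder.layers')
                 .replace('decoder.decoder_layers', 'decoder.layers')
                 .replace('conv.weight', 'weight'))

    state_dict = {remap(k): k for k in nemo_keys if 'preproc' not in k}
    print(sorted(state_dict))

This should print only the two remapped model keys, with the preprocessor entry filtered out.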

Is it possible to fine-tune a fine-tuned model a second time?

This should work fine too.