Closed · soheiltehranipour closed this issue 3 years ago
Hi @soheiltehranipour
Yes, we are working on a major redesigned version of NeMo. You can see it now in the default "main" branch. This version is easily PyTorch-compatible, introduces the concept of models, and adopts PyTorch Lightning for training.
A great place to start familiarizing yourself with this new version is to check out our tutorials (which can be run on Colab; just don't forget to set the runtime type to GPU).
We strongly recommend switching to this new version of NeMo, because it is what the 1.0.0 release will be based on. If you need access to the old version, it can be found under the v0.11.1 tag: https://github.com/NVIDIA/NeMo/releases/tag/v0.11.1
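As a sketch, one way to get that old version locally is to check out the tag from the repository (see the release page above for the official instructions and any pinned dependencies):

```shell
# Clone the repo and check out the legacy v0.11.1 tag
git clone https://github.com/NVIDIA/NeMo.git
cd NeMo
git checkout v0.11.1
```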
I apologize for the inconvenience and hope you'll like the new version. Thank you for your interest in NeMo!
Thanks a lot! Looking forward to more documentation for it.
The first link is dead though.
Fixed the link, thanks.
I only found out about the changes today. It's so easy to migrate, since I just need to load my model weights and I can use the current version straight away based on the fine-tuning example. Thanks.
How exactly would I go about loading model weights from the previous version and using it for speech-to-text locally? (I built a custom manifest using https://github.com/NVIDIA/NeMo/blob/master/examples/asr/notebooks/1_ASR_tutorial_using_NeMo.ipynb as a reference.) (I tried asking the same at https://github.com/NVIDIA/NeMo/issues/1057 in case this thread isn't the right place to have this answered.)
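For reference, the manifest that tutorial builds is a JSON-lines file, one object per utterance with the audio path, duration in seconds, and reference transcript. A minimal sketch (the file paths here are hypothetical placeholders):

```python
import json

# Sketch of the ASR manifest format: one JSON object per line.
entries = [
    {"audio_filepath": "data/sample_0.wav", "duration": 3.2, "text": "hello world"},
    {"audio_filepath": "data/sample_1.wav", "duration": 1.7, "text": "good morning"},
]

with open("train_manifest.json", "w") as f:
    for entry in entries:
        f.write(json.dumps(entry) + "\n")

# Each line parses back to an independent dict:
with open("train_manifest.json") as f:
    lines = [json.loads(line) for line in f]
print(lines[0]["text"])  # hello world
```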
If you use old NeMo, it will save your encoder and decoder weights separately. For the new one, I just instantiate the model and then do something like this:
quartznet = nemo_asr.models.EncDecCTCModel.from_pretrained(model_name="QuartzNet15x5Base-En")  # in hindsight, downloading the pretrained weights isn't necessary, since they get replaced below anyway
enc = torch.load("sk_ckpt/JasperEncoder-STEP-810000.pt")
dec = torch.load("sk_ckpt/JasperDecoderForCTC-STEP-810000.pt")
quartznet.encoder.load_state_dict(enc)
quartznet.decoder.load_state_dict(dec)
del enc
del dec
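The load pattern above (separately saved encoder and decoder state dicts restored into matching submodules of the new model) can be sketched in plain PyTorch. The layer shapes below are toy placeholders, not the real QuartzNet architecture, and in the thread the state dicts come from `torch.load(...)` on the old `.pt` files:

```python
import torch
import torch.nn as nn

# Stand-ins for the separately saved old encoder/decoder weights.
old_encoder = nn.Linear(4, 8)
old_decoder = nn.Linear(8, 2)

class NewModel(nn.Module):
    """Toy model exposing `encoder`/`decoder` submodules, like EncDecCTCModel."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(4, 8)
        self.decoder = nn.Linear(8, 2)

model = NewModel()
# Restore each saved state dict into the matching submodule.
model.encoder.load_state_dict(old_encoder.state_dict())
model.decoder.load_state_dict(old_decoder.state_dict())

assert torch.equal(model.encoder.weight, old_encoder.weight)
```

The key requirement is that the saved state-dict keys match the submodule's parameter names; if they don't, `load_state_dict` raises an error rather than silently skipping weights.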
And then follow the transfer learning tutorial:
from nemo.collections.asr.losses.ctc import CTCLoss
quartznet.loss = CTCLoss(num_classes=quartznet.decoder.num_classes_with_blank - 1, zero_infinity=True)  # without zero_infinity=True my loss was always NaN and the model didn't learn anything; not sure whether the latest version already adds this parameter
quartznet._train_dl.num_workers = 4
quartznet._validation_dl.num_workers = 4
trainer = pl.Trainer(gpus=[1], max_epochs=50,
                     amp_backend='native',
                     precision=16,
                     amp_level='O1',
                     val_check_interval=0.1,
                     accumulate_grad_batches=100)
trainer.fit(quartznet)
Next time, when I want to continue fine-tuning my model, I instantiate it using this:
quartznet = nemo_asr.models.EncDecCTCModel.load_from_checkpoint("lightning_logs/version_37/checkpoints/epoch=0.ckpt")
@okuchaiev Thanks for your efforts. I hope you are including word-level timestamps, both with and without a language model, for ASR in this major release. If not, it's a request to please add this feature as well. Also, is there a tentative date for when 1.0.0 will be released?
Hello,
I have been working with NeMo for about two months, and yesterday I suddenly discovered that everything has been updated. Is that true? Many files are missing, though...