AI4Bharat / IndicWav2Vec

Pretraining, fine-tuning and evaluation scripts for Indic-Wav2Vec2
https://indicnlp.ai4bharat.org/indicwav2vec
MIT License

Fine-tuning using the provided weights won't work. Please try to reproduce it; you will get the error. #15

Closed raotnameh closed 2 years ago

raotnameh commented 2 years ago

AssertionError: Could not infer task type from {'_name': 'temp_sampled_audio_pretraining', 'data': '/workspace/data/mucs/hindi/manifest/', 'max_sample_size': 250000, 'min_sample_size': 32000, 'normalize': False, 'sampling_alpha': 0.7}. Available argparse tasks: dict_keys(['translation_multi_simple_epoch', 'speech_to_text', 'text_to_speech', 'translation', 'translation_lev', 'audio_pretraining', 'sentence_ranking', 'frm_text_to_speech', 'translation_from_pretrained_xlm', 'speech_to_speech', 'legacy_masked_lm', 'cross_lingual_lm', 'translation_from_pretrained_bart', 'multilingual_language_modeling', 'denoising', 'multilingual_masked_lm', 'language_modeling', 'hubert_pretraining', 'speech_unit_modeling', 'multilingual_translation', 'masked_lm', 'simul_speech_to_text', 'simul_text_to_text', 'audio_finetuning', 'multilingual_denoising', 'online_backtranslation', 'sentence_prediction', 'semisupervised_translation', 'dummy_lm', 'dummy_masked_lm', 'dummy_mt']). Available hydra tasks: dict_keys(['translation', 'translation_lev', 'audio_pretraining', 'translation_from_pretrained_xlm', 'multilingual_language_modeling', 'language_modeling', 'hubert_pretraining', 'speech_unit_modeling', 'masked_lm', 'simul_text_to_text', 'audio_finetuning', 'sentence_prediction', 'dummy_lm', 'dummy_masked_lm'])
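
The error says the checkpoint was pretrained with a task named 'temp_sampled_audio_pretraining', which appears to come from the IndicWav2Vec training code and is not in either of the task registries listed by stock fairseq, so fairseq cannot resolve the task when it loads the checkpoint config for fine-tuning. A minimal sketch to confirm this by inspecting the task name stored in the checkpoint (assuming the file name 'indicwav2vec-base.pt' used in the fix below):

import torch

# Load the released checkpoint and print the task name recorded in its
# training config; it should show 'temp_sampled_audio_pretraining'.
ckpt = torch.load('indicwav2vec-base.pt', map_location='cpu')
print(ckpt['cfg']['task']['_name'])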

raotnameh commented 2 years ago

import torch

# Load the released pretraining checkpoint, rewrite the task name stored
# in its config to fairseq's built-in 'audio_finetuning' task, and save
# the patched checkpoint for fine-tuning.
ckpt = torch.load('indicwav2vec-base.pt')
ckpt['cfg']['task']['_name'] = 'audio_finetuning'
torch.save(ckpt, 'indicwav2vec-base-ft.pt')
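
This works because 'audio_finetuning' is a task that stock fairseq already registers (it appears in both task lists in the error above), so the patched checkpoint no longer depends on the custom pretraining task being importable. A quick check, reusing the file names from the snippet above, that the patch took effect:

import torch

# Reload the patched checkpoint and confirm the task name was rewritten.
patched = torch.load('indicwav2vec-base-ft.pt', map_location='cpu')
assert patched['cfg']['task']['_name'] == 'audio_finetuning'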