SpeechLM: KeyError: 'text_transformer' while initializing SpeechLMConfig

JunZhan2000 commented 1 year ago

Hi, I'm trying to use the script in the README to extract features using pre-trained models. I used the model speechlmp_base_asr_checkpoint_best.pt, but I encountered an error while initializing SpeechLMConfig:

Traceback (most recent call last):
  File "/remote-home/jzhan/SpeechT5/SpeechLM/test.py", line 7, in <module>
    cfg = SpeechLMConfig(checkpoint['cfg']['model'])
  File "/SpeechT5/SpeechLM/SpeechLM.py", line 128, in __init__
    self.update(cfg)
  File "/SpeechT5/SpeechLM/SpeechLM.py", line 132, in update
    self.text_transformer = TransformerConfig(model_cfg['text_transformer'])
KeyError: 'text_transformer'

Am I missing any model files?

zz12375 commented 1 year ago

Hello, @guokr233. For now, only the pre-trained models can be used directly to extract features; the README offers three models to choose from. The one you used, speechlmp_base_asr_checkpoint_best.pt, is a fine-tuned ASR model, which cannot be loaded directly by SpeechT5/SpeechLM/SpeechLM.py.

The configuration saved in a checkpoint differs between pre-trained and fine-tuned models. That is why you got this error when loading a fine-tuned model: its model configuration has no 'text_transformer' entry.
What's the difference between the models listed in the above figure and the models listed in the Pre-Trained and Fine-tuned Models table? The 3 models listed in the figure were made from the original checkpoints by copying parameters and removing unused modules.
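One way to catch this before SpeechLMConfig raises is to check the saved model configuration for the key that the traceback shows being indexed. The following is only a minimal sketch: the helper names `is_pretrained_style_cfg` and `check_model_cfg` are hypothetical, and it assumes that `checkpoint['cfg']['model']` behaves like a mapping, which the `model_cfg['text_transformer']` lookup in the traceback suggests.

```python
from typing import Any, Mapping

# Key taken from the traceback in this thread: SpeechLMConfig.update()
# indexes model_cfg['text_transformer'] directly.
REQUIRED_KEY = "text_transformer"


def is_pretrained_style_cfg(model_cfg: Mapping[str, Any]) -> bool:
    """Hypothetical helper: True if the config contains the key that
    SpeechLMConfig.update() looks up."""
    return REQUIRED_KEY in model_cfg


def check_model_cfg(model_cfg: Mapping[str, Any]) -> None:
    """Hypothetical helper: raise a descriptive error instead of a bare KeyError."""
    if not is_pretrained_style_cfg(model_cfg):
        raise ValueError(
            f"Model config has no '{REQUIRED_KEY}' entry; this looks like a "
            "fine-tuned checkpoint. Use one of the pre-trained models listed "
            "in the README for feature extraction."
        )


# Intended use, mirroring line 7 of test.py in the traceback
# (loading the file with torch.load is an assumption, not shown in the thread):
#   checkpoint = torch.load(checkpoint_path)
#   check_model_cfg(checkpoint['cfg']['model'])
#   cfg = SpeechLMConfig(checkpoint['cfg']['model'])
```

The check only looks at the single key seen in the traceback, so passing it does not guarantee that the rest of the configuration matches what SpeechLMConfig expects.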

Hope the above information helps you.

JunZhan2000 commented 1 year ago

I used the model mentioned in "Extract features using pre-trained models" and it worked. Thank you very much!

microsoft / SpeechT5

SpeechLM: KeyError: 'text_transformer' while initializing SpeechLMConfig #26