ziz19 / Speech-Enhancement-for-noisy-ASR

project for spoken language technology

Help required for fine-tuning the model "speechbrain/asr-transformer-aishell" #1

Open jiyzhang opened 3 years ago

jiyzhang commented 3 years ago

Hi,

Could you please give me a hand with fine-tuning the model "speechbrain/asr-transformer-aishell"? I've gone through the Colab tutorial "Pretrained Models and Fine-Tuning with Huggingface" and followed its steps to fine-tune the "asr-transformer-aishell" model. What confuses me is the model initialization: what information should I provide to load the pretrained model?

For example, for the pretrained model "speechbrain/asr-crdnn-rnnlm-librispeech", we provide the parameters modules and hparams:

```python
modules = {
    "enc": asr_model.modules.encoder.model,
    "emb": asr_model.hparams.emb,
    "dec": asr_model.hparams.dec,
    "compute_features": asr_model.modules.encoder.compute_features,  # we use the same features
    "normalize": asr_model.modules.encoder.normalize,
    "seq_lin": asr_model.hparams.seq_lin,
}

hparams = {
    "seq_cost": lambda x, y, z: speechbrain.nnet.losses.nll_loss(x, y, z, label_smoothing=0.1),
    "log_softmax": speechbrain.nnet.activations.Softmax(apply_log=True),
}

brain = EncDecFineTune(modules, hparams=hparams, opt_class=lambda x: torch.optim.SGD(x, 1e-5))
brain.tokenizer = asr_model.tokenizer
```

But I don't know which "modules" and "hparams" I should define for "asr-transformer-aishell". Could you please tell me where to find the information needed to define them?

Thanks in advance!

ziz19 commented 3 years ago

Hi, I would suggest looking at the ASR tutorial folder in their GitHub repository. There's a train.yaml there which defines the required modules and params. There is also a [related ASR tutorial on Colab](https://colab.research.google.com/drive/1aFgzrUv3udM_gNJNUoLaHIm78QHtxdIz?usp=sharing), which I think gives some insight into how these modules and params are used.

jiyzhang commented 3 years ago

I tried to set up the modules the way you did:

```python
modules = {
    "CNN": asr_model.modules.asr_model[0],
    "Transformer": asr_model.modules.asr_model[1],
    "seq_lin": asr_model.modules.asr_model[2],
    "ctc_lin": asr_model.modules.asr_model[3],
    "env_corrupt": asr_model.modules.asr_model[4],
    "compute_features": asr_model.modules.encoder.compute_features,  # we use the same features
    "normalize": asr_model.modules.encoder.normalize,
}
```

But when I ran, it gave the error:

```
Traceback (most recent call last):
  File "/root/sb/finetune2.py", line 381, in <module>
    modules = {"CNN": asr_model.modules.asr_model[0],
  File "/root/anaconda3/envs/sb/lib/python3.9/site-packages/torch/nn/modules/module.py", line 947, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'ModuleDict' object has no attribute 'asr_model'
```

Per the hparams file in the pretrained model, asr_model is defined as:

```yaml
asr_model: !new:torch.nn.ModuleList
    - [!ref <CNN>, !ref <Transformer>, !ref <seq_lin>, !ref <ctc_lin>]
```

This is driving me crazy. Could you please give me a hand with it? You can find the pretrained model at https://huggingface.co/speechbrain/asr-transformer-aishell

Thanks!

ziz19 commented 3 years ago

> I tried to set up the modules the way you did:
>
> ```python
> modules = {
>     "CNN": asr_model.modules.asr_model[0],
>     "Transformer": asr_model.modules.asr_model[1],
>     "seq_lin": asr_model.modules.asr_model[2],
>     "ctc_lin": asr_model.modules.asr_model[3],
>     "env_corrupt": asr_model.modules.asr_model[4],
>     "compute_features": asr_model.modules.encoder.compute_features,  # we use the same features
>     "normalize": asr_model.modules.encoder.normalize,
> }
> ```
>
> But when I ran, it gave the error:
>
> ```
> Traceback (most recent call last):
>   File "/root/sb/finetune2.py", line 381, in <module>
>     modules = {"CNN": asr_model.modules.asr_model[0],
>   File "/root/anaconda3/envs/sb/lib/python3.9/site-packages/torch/nn/modules/module.py", line 947, in __getattr__
>     raise AttributeError("'{}' object has no attribute '{}'".format(
> AttributeError: 'ModuleDict' object has no attribute 'asr_model'
> ```
>
> Per the hparams file in the pretrained model, asr_model is defined as:
>
> ```yaml
> asr_model: !new:torch.nn.ModuleList
>     - [!ref <CNN>, !ref <Transformer>, !ref <seq_lin>, !ref <ctc_lin>]
> ```
>
> This is driving me crazy. Could you please give me a hand with it? You can find the pretrained model at https://huggingface.co/speechbrain/asr-transformer-aishell
>
> Thanks!

I think SpeechBrain has been updated since I wrote this, and my code only works with an older version. In the new version, I think you should use `asr_model.hparams.asr_model` instead of `asr_model.modules.asr_model`. The contents of `asr_model.modules` are defined in the yaml file as:

modules:
    encoder: !ref <encoder>
    decoder: !ref <decoder>
    lm_model: !ref <lm_model>
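
To make the difference concrete, here is a minimal sketch of the lookup using only the standard library. The `pretrained` object is a hypothetical stand-in that mirrors the layout described in this thread (`modules` holding only encoder, decoder and lm_model; `hparams` holding the four-entry `asr_model` list); it is not the real SpeechBrain object, and `build_modules` is an illustrative helper name, not part of any library.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the loaded pretrained model. Strings take the
# place of the real torch modules so the sketch runs without SpeechBrain.
pretrained = SimpleNamespace(
    # Only the entries listed under `modules:` in the yaml file live here.
    modules=SimpleNamespace(
        encoder="<encoder>", decoder="<decoder>", lm_model="<lm_model>"
    ),
    # Top-level yaml keys such as `asr_model` are reached through hparams.
    hparams=SimpleNamespace(
        asr_model=["<CNN>", "<Transformer>", "<seq_lin>", "<ctc_lin>"]
    ),
)


def build_modules(model, names=("CNN", "Transformer", "seq_lin", "ctc_lin")):
    """Pair each name with the entry at the same position in hparams.asr_model."""
    parts = model.hparams.asr_model
    if len(parts) != len(names):
        # Guards against indexing past the end of the list (e.g. index 4).
        raise ValueError(f"expected {len(names)} submodules, found {len(parts)}")
    return dict(zip(names, parts))


# `modules` has no `asr_model` attribute, which matches the AttributeError above.
print(hasattr(pretrained.modules, "asr_model"))
print(list(build_modules(pretrained)))
```

With the real model you would pass the loaded object instead of the stand-in, assuming it exposes `hparams.asr_model` as suggested above. Note that the list quoted from the hparams file has only four entries, so `asr_model[4]` (the "env_corrupt" line) would presumably fail with an IndexError even after switching to `hparams`.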