Kiri0824 opened 1 week ago
Could you check the training loss over time to see whether performance is actually improving?
If the training loss isn't improving, it's possible that your model isn't actually fitting the training data very well, since the default ASR configuration here only trains a small number of parameters.
To improve performance, you can try increasing the number of trained parameters by using a larger prediction head, or by fine-tuning the AV-HuBERT encoder with the `--upstream_trainable` option in `run_downstream.py`.
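In case it's useful, here's a minimal sketch of how you could verify how much of the model is actually trainable; `model` is a placeholder for whatever combined upstream + downstream module you're running, not a name from this repo:

```python
# Minimal sketch: count trainable vs. total parameters.
# `model` is a placeholder for the combined upstream + downstream module.
import torch

def count_params(model: torch.nn.Module) -> tuple[int, int]:
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    return trainable, total

# With the upstream frozen (the default), `trainable` should be only the
# prediction head's parameters, i.e. a small fraction of `total`.
```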
here's my result: the loss seems normal, but when I print the predicted tokens while calculating the metrics, they're all empty. I think it's a problem with my configuration?
I'm currently training this code on a Chinese dataset. The upstream I'm using is the fusion_feats of avhubert, and the downstream is av_asr. I've built the dictionary for Chinese. The problem I've run into: right after the model was initialized, the predicted tokens had values, but after several training steps, when I printed the pred token results while computing the metrics, they were all empty.
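Not sure if this applies to your setup, but if the downstream head is CTC-based (as in s3prl's standard ASR recipe), all-empty predictions early in training are often just the model collapsing to the blank token at every frame, which greedy CTC decoding then reduces to an empty sequence. A minimal sketch of that effect, assuming greedy decoding with blank id 0:

```python
# Sketch of why greedy CTC decoding can return empty predictions:
# if blank wins the argmax at every frame, everything collapses away.
import torch

def greedy_ctc_decode(logits: torch.Tensor, blank: int = 0) -> list[int]:
    """logits: (time, vocab). Argmax per frame, merge repeats, drop blanks."""
    out, prev = [], blank
    for i in logits.argmax(dim=-1).tolist():
        if i != blank and i != prev:
            out.append(i)
        prev = i
    return out

logits = torch.zeros(50, 100)
logits[:, 0] = 5.0  # blank dominates every frame (common early in training)
print(greedy_ctc_decode(logits))  # -> []
```

If that's what's happening, the loss can look normal while predictions stay empty until the model learns to emit non-blank tokens; checking whether the loss keeps decreasing over many more steps (per the suggestion above) would confirm it.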