Open RyunosukeYuu opened 6 months ago
I believe the model already preprocesses input data at a 16000 Hz sampling rate here: https://github.com/microsoft/unilm/blob/8ee6f747da65448b125363eabbe64630bb9c4a25/beats/BEATs.py#L127
Haha, I have solved this problem. The issue was that the loss would not decrease when the extracted features were fed into the pre-trained model for training. This had been bothering me for a long time.
Do I have to use audio sampled at 16 kHz to use BEATs? When I fed the extracted features into a ResNet-18 for a downstream classification task, the loss would not decrease.
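For anyone hitting this later: the linked preprocessing step assumes the waveform is already at 16 kHz rather than resampling it for you, so audio recorded at another rate should be resampled before extracting features. A minimal sketch of that resampling step (using `scipy.signal.resample_poly` here is my choice, not something the BEATs repo prescribes; `torchaudio.functional.resample` would work equally well):

```python
from math import gcd

import numpy as np
from scipy.signal import resample_poly

def resample_to_16k(waveform: np.ndarray, orig_sr: int, target_sr: int = 16000) -> np.ndarray:
    """Resample a 1-D waveform to target_sr via polyphase filtering."""
    if orig_sr == target_sr:
        return waveform
    g = gcd(orig_sr, target_sr)
    # up/down factors reduced by their GCD to keep the filter small
    return resample_poly(waveform, target_sr // g, orig_sr // g)

# Example: one second of 44.1 kHz audio becomes 16000 samples at 16 kHz
x = np.random.randn(44100).astype(np.float32)
y = resample_to_16k(x, 44100)
print(len(y))  # 16000
```

After resampling, the waveform can be passed to the model's feature extraction as usual.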