Open Blosslzy opened 4 months ago
I have the same question. In the paper, the training is divided into two steps, but in the code implementation, it seems to complete all the steps in one go, rather than finishing the first step, freezing the teacher model, and then using it to train the student model. If it's convenient for you, could you please provide an explanation?
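I have the same question, and the loss also differs from the one in the original paper. The results I got are fairly poor: on FF++, frame-level, ACC 91 and AUC 96.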
Thank you for sharing your work! I'm eagerly awaiting the release of the supplementary materials, as the process of freezing the real teacher isn't thoroughly explained in the main body. Additionally, in the code it appears that the entire real teacher still takes part in training rather than being frozen. I look forward to hearing from you soon.
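To make the question concrete, here is a minimal sketch of the two-stage schedule as it is described in this thread: train the teacher, freeze it, then train only the student against it. It assumes PyTorch. The toy `nn.Linear` models, the random data, and the MSE losses are hypothetical placeholders and are not taken from the repository or the paper.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins: the real teacher/student and their losses differ.
teacher = nn.Linear(8, 4)
student = nn.Linear(8, 4)
x = torch.randn(16, 8)
target = torch.randn(16, 4)

# Stage 1: train the teacher on its own objective (placeholder MSE loss).
opt_t = torch.optim.SGD(teacher.parameters(), lr=0.1)
for _ in range(5):
    opt_t.zero_grad()
    nn.functional.mse_loss(teacher(x), target).backward()
    opt_t.step()

# Freeze the teacher: disable gradients and switch to eval mode.
for p in teacher.parameters():
    p.requires_grad_(False)
teacher.eval()
teacher_before = [p.clone() for p in teacher.parameters()]

# Stage 2: only the student's parameters are given to the optimizer.
opt_s = torch.optim.SGD(student.parameters(), lr=0.1)
for _ in range(5):
    opt_s.zero_grad()
    with torch.no_grad():  # teacher output is a fixed target
        t_out = teacher(x)
    loss = nn.functional.mse_loss(student(x), t_out)
    loss.backward()
    opt_s.step()

# Check that stage 2 left the teacher's weights untouched.
teacher_unchanged = all(
    torch.equal(a, b) for a, b in zip(teacher_before, teacher.parameters())
)
```

In a single joint loop where the teacher's parameters are also passed to the optimizer and keep `requires_grad=True`, the teacher would continue to be updated alongside the student. Whether that matches the released code is exactly what is being asked here.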