RicherMans / Datadriven-GPVAD

The codebase for Data-driven general-purpose voice activity detection.

About how to perform fine-tuning #7

Open AjianIronSide opened 3 years ago

AjianIronSide commented 3 years ago

Hi,

Do you have any idea about fine-tuning a pretrained model (such as sre) to a more complicated scenario using a small related data set? I tried to use the teacher model to label the new data set and then trained for a few epochs with a very small learning rate. However, the performance drops drastically. Quite sad.
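
Roughly what I tried, as a simplified sketch: `teacher`, `student`, and `loader` below are stand-ins for the pretrained GPVAD checkpoints and my small in-domain data set, just so the snippet runs end to end.

```python
import torch
import torch.nn as nn

# Stand-ins so the sketch runs end to end; the real teacher/student are the
# pretrained GPVAD checkpoints and `loader` yields log-mel feature batches.
teacher = nn.Sequential(nn.Linear(64, 2), nn.Sigmoid())
student = nn.Sequential(nn.Linear(64, 2), nn.Sigmoid())
loader = [torch.randn(8, 500, 64) for _ in range(4)]  # (batch, frames, mels)

optimizer = torch.optim.Adam(student.parameters(), lr=1e-5)  # very small LR
criterion = nn.BCELoss()  # frame-level BCE against the teacher's soft labels

teacher.eval()
student.train()
for epoch in range(3):  # only a few epochs
    for feats in loader:
        with torch.no_grad():
            soft_labels = teacher(feats)  # teacher posteriors as targets
        loss = criterion(student(feats), soft_labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```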

RicherMans commented 3 years ago

Hmm, sorry, but I haven't really done this type of research. In theory the method should work, but the general problem, I believe, is that the trained students such as sre are likely overfitted on their own data, i.e., on some type of English speech. Thus I would still recommend the method described in the paper (use the teacher to estimate the speech labels).

Btw, what do you mean by performance drops? Drops compared to the student/teacher?

AjianIronSide commented 3 years ago

Yes, the fine-tuned models drop against the student/teacher model you provided. Your model is very good at rejecting noise, but if speech comes with complicated background noise, it is very likely to be rejected as well.

The hurdle is that I cannot train the teacher model myself, because I do not have data with the 527 label types. Do you have any idea on training the teacher model?

RicherMans commented 3 years ago

Well, just use mine I guess. They are all in the code, for example: teacher 1 and teacher 2.

In forward.py, just pass t1 or t2 as seen here.
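
Something along these lines; I'm quoting the flags from memory, so double-check the argparse setup in forward.py for the exact names:

```bash
# Run inference with teacher 2 instead of the default student model
python3 forward.py -model t2 -w my_noisy_recording.wav
```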

AjianIronSide commented 3 years ago

Yeah, I tried. Sadly, the results are not good after tuning.

RicherMans commented 3 years ago

Seems weird to me, to be honest. At least I ran experiments even on Chinese after training with teacher t2 and got good results; the student usually still outperforms the teacher.

Also, the loss during my training usually does not decrease by a large margin. Generally it starts at ~0.61 and the final loss is around ~0.5.
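
One thing to keep in mind about those numbers: if the student is trained with BCE against the teacher's soft posteriors, the loss is bounded below by the targets' own entropy, so it can never approach zero even for a perfect student. A quick illustration with made-up posterior values:

```python
import torch
import torch.nn.functional as F

# BCE of soft targets against themselves equals their entropy, which is the
# lowest loss any student can reach on those targets.
soft = torch.tensor([0.9, 0.7, 0.2, 0.05])  # made-up frame posteriors
floor = F.binary_cross_entropy(soft, soft)
print(floor)  # ~0.41 here; the exact floor depends on the label distribution
```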

saumyaborwankar commented 3 years ago

> Seems weird to me, to be honest. At least I ran experiments even on Chinese after training with teacher t2 and got good results; the student usually still outperforms the teacher.
>
> Also, the loss during my training usually does not decrease by a large margin. Generally it starts at ~0.61 and the final loss is around ~0.5.

Hi sir, could you share how to do this?

RicherMans commented 3 years ago

Just as described in the README: first estimate soft labels from a teacher, then train the new student.
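
As a rough sketch of the two steps (the names below are illustrative, not the repo's exact scripts):

```python
import h5py
import torch
import torch.nn as nn

# Step 1: dump the teacher's frame-level posteriors for every unlabeled
# utterance; these become the student's training targets. The stand-in
# module/data below replace the real t1/t2 checkpoint and feature extraction.
teacher = nn.Sequential(nn.Linear(64, 2), nn.Sigmoid())
dataset = [("utt1", torch.randn(500, 64)), ("utt2", torch.randn(700, 64))]

teacher.eval()
with h5py.File("soft_labels.h5", "w") as store, torch.no_grad():
    for utt_id, feats in dataset:               # (frames, mel_bins) features
        store[utt_id] = teacher(feats).numpy()  # (frames, classes) soft targets

# Step 2: train the student as usual, but read its frame-level targets from
# soft_labels.h5 instead of human annotations.
```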