huggingface / distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
MIT License

Loss Weight ablation experiment #75

Open JinYu1998 opened 5 months ago

JinYu1998 commented 5 months ago

Have you investigated the effect of different loss weights on the distillation results?

sanchit-gandhi commented 4 months ago

Hey @JinYu1998 - we did a coarse sweep over the KL weights when setting up preliminary experiments on just the LibriSpeech corpus. We found the setting from DistilBART to be best, and so committed to this for the rest of the project. We didn't do any further tuning of the loss weights on our full training set. You can find an ablation over the loss terms (not weights) on page 26 of the paper.
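
For context, the objective being discussed is a weighted sum of a cross-entropy term on the pseudo-labels and a KL-divergence term against the teacher's distribution. Below is a minimal PyTorch sketch of such a blended loss; the weights and temperature are illustrative placeholders, not the exact settings from the paper or the training scripts:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      ce_weight=0.8, kl_weight=1.0, temperature=2.0):
    """Weighted distillation objective: CE on (pseudo-)labels + KL to the teacher.

    ce_weight, kl_weight and temperature are illustrative placeholders,
    not the exact values used in Distil-Whisper.
    """
    # Standard cross-entropy against the (pseudo-)labels.
    ce_loss = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
        ignore_index=-100,  # skip padded positions
    )

    # KL divergence between temperature-softened distributions.
    # Scaling by T^2 keeps gradient magnitudes comparable across temperatures.
    kl_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.log_softmax(teacher_logits / temperature, dim=-1),
        log_target=True,
        reduction="batchmean",
    ) * temperature**2

    return ce_weight * ce_loss + kl_weight * kl_loss
```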

JinYu1998 commented 4 months ago

Thank you for your reply. I have previously worked on dynamic temperature distillation for classification tasks, and just recently finished that work. I'm very interested in distillation for Whisper, and I look forward to combining my work with Distil-Whisper.
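
(For readers unfamiliar with the term: in dynamic temperature distillation, the softening temperature of the KL term varies per sample or per step rather than staying fixed. The sketch below shows one hypothetical variant where the temperature tracks the teacher's predictive entropy; it is a generic illustration, not @JinYu1998's actual method:)

```python
import torch
import torch.nn.functional as F

def dynamic_temp_kl(student_logits, teacher_logits, t_min=1.0, t_max=4.0):
    """KL loss with a per-sample temperature derived from teacher entropy.

    Hypothetical scheme: a confident teacher (low entropy) gets a low
    temperature, an uncertain teacher a higher one. t_min/t_max are
    illustrative bounds, not values from any published method.
    """
    probs = F.softmax(teacher_logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1, keepdim=True)
    max_entropy = torch.log(torch.tensor(float(teacher_logits.size(-1))))
    # Map normalized entropy in [0, 1] onto [t_min, t_max], per sample.
    t = t_min + (t_max - t_min) * entropy / max_entropy  # shape (batch, 1)

    kl = F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.log_softmax(teacher_logits / t, dim=-1),
        log_target=True,
        reduction="none",
    ).sum(-1)  # per-sample KL

    # Rescale each sample by its own T^2, then average over the batch.
    return (kl * t.squeeze(-1) ** 2).mean()
```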
