huggingface / distil-whisper

A distilled variant of Whisper for speech recognition: 6x faster, 50% smaller, and within 1% word error rate of the original.
MIT License
3.32k stars 238 forks

[init] choose student model initialization layers #122

Closed eustlb closed 1 month ago

eustlb commented 2 months ago

Adds the option to choose which specific layers of the teacher model are used to initialize the student model's decoder.
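A minimal, framework-agnostic sketch of what such a selection step could look like. The function name, signature, and the evenly-spaced fallback heuristic are illustrative assumptions, not the repository's actual implementation (Distil-Whisper's published recipe initializes a 2-layer student decoder from the teacher's first and last decoder layers):

```python
def select_teacher_layers(teacher_layers, student_num_layers, layer_indices=None):
    """Pick which teacher decoder layers seed the student decoder.

    teacher_layers: sequence of layer objects (e.g. weight tensors or modules).
    student_num_layers: number of decoder layers in the student.
    layer_indices: explicit teacher layer indices to copy, one per student
        layer. If None, fall back to maximally spaced layers that always
        include the first and last teacher layer (illustrative heuristic).
    """
    n = len(teacher_layers)
    if layer_indices is None:
        if student_num_layers == 1:
            layer_indices = [0]
        else:
            step = (n - 1) / (student_num_layers - 1)
            layer_indices = [round(i * step) for i in range(student_num_layers)]
    if len(layer_indices) != student_num_layers:
        raise ValueError("need exactly one teacher layer index per student layer")
    if any(i < 0 or i >= n for i in layer_indices):
        raise ValueError("teacher layer index out of range")
    return [teacher_layers[i] for i in layer_indices]


# With a hypothetical 32-layer teacher and a 2-layer student, the default
# heuristic picks the first and last layers; an explicit list overrides it.
teacher = list(range(32))
print(select_teacher_layers(teacher, 2))            # default spacing
print(select_teacher_layers(teacher, 2, [10, 20]))  # user-chosen layers
```

In a real training script, the returned layers' state dicts would then be copied into the corresponding student decoder layers before distillation begins.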