avishreekh closed 4 years ago
This is a great paper on the NLP side of things. You can later try something along these lines for transformer-based networks.
You can add your name to CONTRIBUTORS.rst. :smile:
I actually found an ICLR paper for transformers. Will definitely think of that after this.
Yup, I think I have read that paper. Is it TinyBERT?
Sequence-Level Knowledge Distillation, Yoon Kim, Alexander M. Rush, 2016 https://arxiv.org/pdf/1606.07947.pdf