-
We've been struggling too much with setting up and running the 6 models from the Whisperer Challenge. Over the weekend, I researched some existing HSI datasets. Here's some [chat log](ht…
-
Hi there,
Thank you for your nice work and code.
I am trying to run the `train_cvpr2023.py --config scripts/DFKD_Cifar10_ResNet34_ResNet18_V37_conv012.yaml` example on an A100 GPU. With tqdm, t…
-
I read the paper [Structural Knowledge Distillation for Object Detection](https://arxiv.org/abs/2211.13133v1) and implemented KD with YOLOv8, but the result was very bad. I think that the problem was …
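For reference, here is a minimal sketch of the kind of feature-imitation KD loss such detection setups typically build on; the 1x1 adapter and plain MSE are generic assumptions for illustration, not the paper's exact structural-KD formulation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureKDLoss(nn.Module):
    """MSE imitation loss between teacher and student feature maps."""

    def __init__(self, student_channels: int, teacher_channels: int):
        super().__init__()
        # 1x1 conv projects student features to the teacher's channel width.
        self.adapter = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self, feat_s: torch.Tensor, feat_t: torch.Tensor) -> torch.Tensor:
        feat_s = self.adapter(feat_s)
        # Match spatial size if the two backbones downsample differently.
        if feat_s.shape[-2:] != feat_t.shape[-2:]:
            feat_s = F.interpolate(feat_s, size=feat_t.shape[-2:], mode="bilinear")
        # Teacher features are targets only, so no gradient flows into them.
        return F.mse_loss(feat_s, feat_t.detach())
```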
-
How can I use different language models from Hugging Face for knowledge distillation in this setup?
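For anyone wondering the same, a minimal sketch of what swapping in Hugging Face models as teacher and student might look like; the checkpoint pair is an illustrative assumption (chosen because they share a vocabulary), and the resulting logits would then feed whatever KD loss this setup uses:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical checkpoint pair; any two models with compatible tokenizers work.
# Note: the classification heads are freshly initialized here, not trained.
teacher = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
student = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")

# DistilBERT's tokenizer reuses BERT's vocabulary, so one tokenizer serves both.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

teacher.eval()
batch = tokenizer(["an example sentence"], return_tensors="pt", padding=True)

with torch.no_grad():
    teacher_logits = teacher(**batch).logits  # soft targets for the student
student_logits = student(**batch).logits      # trained to match the teacher
```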
-
Hello, knowledge distillation is mentioned in the paper, but I didn't see it in the code.
-
TBD
-
Thanks for sharing. Can you tell me which loss function in `kd_losses` is best for a classification task?
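For context, the usual default for classification is the vanilla soft-target loss from Hinton et al. (2015); a minimal sketch follows, where the temperature `T` and weight `alpha` are illustrative and this may not match the repo's exact `kd_losses` implementation:

```python
import torch.nn.functional as F

def vanilla_kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Weighted sum of soft-label KL divergence and hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # T^2 keeps gradient magnitudes comparable across temperatures
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```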
-
-
I find it truly fascinating! Have you come across any methods similar to pruning, distillation, or quantization that could be applied to this model? While I'm aware of some size options, it would be t…
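As one concrete instance of the techniques mentioned, here is a minimal sketch of post-training dynamic quantization in PyTorch; the toy model is a stand-in, and whether this transfers to the model in question is an open assumption:

```python
import torch
import torch.nn as nn

# Stand-in model; replace with the model under discussion.
model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))
model.eval()

# Dynamic quantization: Linear weights are stored as int8, and
# activations are quantized on the fly at inference time.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 512)
print(quantized(x).shape)  # torch.Size([1, 10])
```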
-
## In a nutshell
This work studies what makes an effective teacher in knowledge distillation. It finds that consistency and patience are the key properties of an effective teacher, and that distillation benefits from longer training schedules with more epochs.
### Paper link
https://openaccess.thecvf.com/content/C…
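To make "consistent" concrete: as I read the paper, the teacher and student must see exactly the same augmented view of each image, rather than the teacher scoring a clean one. A minimal sketch of that training step, where the `augment` function and temperature are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def consistent_kd_step(teacher, student, images, augment, T=1.0):
    """One distillation step with consistent teaching: both models receive
    identical augmented views, so the student matches the teacher as a
    function over the augmented input distribution."""
    views = augment(images)  # one shared random augmentation per image
    with torch.no_grad():
        t_logits = teacher(views)
    s_logits = student(views)
    return F.kl_div(
        F.log_softmax(s_logits / T, dim=-1),
        F.softmax(t_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
```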