-
Hello, I noticed there are two training scripts in the repository: train_CSAKD.py and train_CSAKD_offline_teacher_student.py.
Could you please clarify the differences between these two scripts?
I'm…
-
Hello, after training I went to test.py to generate a visualization of the anomaly localization, following the approach from a previously closed issue, and I got the following error. What is t…
-
## Information
The problem arises in chapter:
* [ ] Making Transformers Efficient in Production
## Describe the bug
While training, I am getting a proper F1 score of 0.755940
![image](https:…
-
For example, the teacher model is Faster R-CNN and the student model is YOLOv3. Where can I find out which modules the models have? When I write a random module name, I get a KeyError.
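In case it helps: in PyTorch you can enumerate the valid submodule names rather than guessing them. A minimal sketch, assuming a recent torchvision Faster R-CNN as the teacher (the dotted name below is illustrative):

```python
import torchvision

# Sketch: list every registered submodule of the teacher so you know
# which names are valid; any other name will raise a KeyError.
teacher = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=None)

for name, module in teacher.named_modules():
    print(name, type(module).__name__)

# Fetch one submodule by its dotted name (available in PyTorch >= 1.9):
layer4 = teacher.get_submodule("backbone.body.layer4")
```

The same applies to the YOLOv3 student, whichever implementation you use: its `named_modules()` output defines the only module names that will work as keys.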
-
Hi Yoon,
As mentioned in the [Sequence-Level Knowledge Distillation](https://arxiv.org/pdf/1606.07947.pdf) paper, the implementation of the distillation model is released in this repo, but I didn't find the …
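For anyone landing here: the core of sequence-level KD in that paper is training the student on the teacher's beam-search outputs instead of the gold references. A rough sketch under that assumption (`teacher.generate` is a hypothetical stand-in for whatever decode function the repo actually exposes):

```python
import torch

@torch.no_grad()
def build_pseudo_targets(teacher, sources, beam_size=5):
    # The teacher decodes each source with beam search; its 1-best
    # outputs replace the references as the student's training targets.
    # `teacher.generate` is hypothetical -- substitute the repo's decoder.
    return [teacher.generate(src, beam_size=beam_size) for src in sources]

def seq_kd_step(student, optimizer, src, pseudo_tgt, criterion):
    # Ordinary teacher-forced cross-entropy, but on the pseudo-targets.
    optimizer.zero_grad()
    logits = student(src, pseudo_tgt[:, :-1])
    loss = criterion(logits.reshape(-1, logits.size(-1)),
                     pseudo_tgt[:, 1:].reshape(-1))
    loss.backward()
    optimizer.step()
    return loss.item()
```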
-
![image](https://github.com/YoojLee/paper_review/assets/52986798/4133f5cb-d108-472c-86a5-2db4f4983933)
## Summary
From open-vocabulary image classification models (VLMs) such as CLIP, into a two-stage detector…
-
Could you help add the paper to the list?
Paper (Oral): Boosting 3D Object Detection by Simulating Multimodality on Point Clouds
Paper Link: https://arxiv.org/abs/2206.14971
Thanks!
-
### Description & Motivation
_No response_
### Pitch
An example of knowledge distillation, especially showing how to load the teacher model's weights and train the student model.
Now I have a trained t…
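A generic sketch of what such an example could look like (the architectures, checkpoint path, and data loader below are placeholders, not the library's actual API): load the trained teacher, freeze it, and train the student against the usual softened-KL distillation loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder models -- swap in your own teacher/student architectures.
teacher = nn.Sequential(nn.Linear(784, 512), nn.ReLU(), nn.Linear(512, 10))
teacher.load_state_dict(torch.load("teacher.pt"))  # your trained weights
teacher.eval()
for p in teacher.parameters():
    p.requires_grad_(False)  # the teacher is frozen during distillation

student = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
T, alpha = 4.0, 0.5  # distillation temperature and CE/KD mixing weight

for x, y in train_loader:  # any (inputs, labels) loader
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)
    # Softened KL between student and teacher distributions, scaled by T^2.
    kd = F.kl_div(F.log_softmax(s_logits / T, dim=-1),
                  F.softmax(t_logits / T, dim=-1),
                  reduction="batchmean") * T * T
    loss = alpha * F.cross_entropy(s_logits, y) + (1 - alpha) * kd
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```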
-
Hi,
Can I use knowledge distillation and dimensionality reduction for BERT-large?
If it is possible, how many layers should remain in option2 for knowledge distillation?
And for dimension …
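Without knowing what option2 refers to in this codebase, the usual layer-reduction recipe is to initialize a shallower student from a subset of the teacher's layers and then distill. A hypothetical sketch with Hugging Face Transformers, keeping every other layer of bert-large (12 of 24):

```python
from transformers import BertConfig, BertModel

# Teacher: bert-large-uncased has 24 encoder layers, hidden size 1024.
teacher = BertModel.from_pretrained("bert-large-uncased")

# Student: same hidden size, half the depth, so weights copy over directly.
student_cfg = BertConfig.from_pretrained("bert-large-uncased",
                                         num_hidden_layers=12)
student = BertModel(student_cfg)

# Copy the embeddings, then encoder layers 0, 2, 4, ... from the teacher.
student.embeddings.load_state_dict(teacher.embeddings.state_dict())
for s_idx, t_idx in enumerate(range(0, 24, 2)):
    student.encoder.layer[s_idx].load_state_dict(
        teacher.encoder.layer[t_idx].state_dict())
```

Note that dimensionality reduction (a smaller hidden size) breaks this direct weight copy; in that case the student needs either distillation from random init or a learned projection between the two hidden spaces.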
-
Can you provide the details of how the model is fine-tuned for 1000 epochs with DeiT-style knowledge distillation? Thanks!
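Not the authors, but for reference: DeiT-style hard distillation trains a separate distillation token/head on the teacher's argmax labels while the class head sees the ground truth; the exact 1000-epoch schedule and teacher choice would have to come from their configs. A minimal sketch of the loss:

```python
import torch
import torch.nn.functional as F

def deit_hard_distill_loss(cls_logits, dist_logits, teacher_logits, targets):
    # Teacher's hard predictions serve as targets for the distillation head.
    hard_teacher = teacher_logits.argmax(dim=-1)
    loss_cls = F.cross_entropy(cls_logits, targets)        # class-token head
    loss_dist = F.cross_entropy(dist_logits, hard_teacher)  # distillation head
    return 0.5 * loss_cls + 0.5 * loss_dist
```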