-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussion…
-
### Metadata: Knowledge Distillation Meets Self-Supervision
- Authors: Guodong Xu, Ziwei Liu, Xiaoxiao Li, Chen Change Loy
- Organization: The Chinese University of Hong Kong & Nanyang Technological …
-
### My Issue:
1. No matter what value I set for the **--save_steps** parameter, the system always saves the checkpoint after exactly 500 steps.
2. No matter what value I set for the **--save_total_l…
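A minimal sketch of how such a flag is usually consumed, assuming it maps to the standard Hugging Face `TrainingArguments` (the `output_dir` and the value `1000` below are illustrative assumptions, not taken from this issue); notably, 500 is the transformers default for `save_steps`, so a checkpoint at exactly step 500 often means the command-line value never reached the trainer:

```python
# Hedged sketch: wiring --save_steps into the Hugging Face Trainer.
# All concrete values here are illustrative assumptions.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",       # illustrative output directory
    save_strategy="steps",  # checkpoint by step count, not per epoch
    save_steps=1000,        # the value passed as --save_steps
)
# 500 is the library default for save_steps; if checkpoints still appear
# every 500 steps, the CLI value is most likely not being forwarded here.
print(args.save_steps)
```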
-
TBD
-
Hello, @545999961.
I was fine-tuning bge-m3 and found a bug when not using the `knowledge_distilation` parameter.
This was my training script:
```
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --np…
```
-
Currently `rf.BatchNorm` decides whether to update the running statistics based on the `rf.get_run_ctx().train_flag` as in [this line](https://github.com/rwth-i6/returnn/blob/master/returnn/frontend/n…
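As a PyTorch-style illustration of the pattern under discussion (a sketch only, not RETURNN's actual `rf.BatchNorm` implementation), a batch-norm layer can gate the running-statistics update on an explicit flag instead of deriving it implicitly from the run context:

```python
# Illustrative sketch only; the real behaviour lives in RETURNN's frontend.
import torch
import torch.nn.functional as F

class FlagGatedBatchNorm(torch.nn.BatchNorm1d):
    """BatchNorm1d whose running statistics update only when explicitly requested."""

    def forward(self, x: torch.Tensor, update_running_stats: bool = True) -> torch.Tensor:
        if self.training and not update_running_stats:
            # Normalize with the current batch statistics, but leave
            # running_mean / running_var untouched.
            return F.batch_norm(
                x, None, None, self.weight, self.bias,
                training=True, momentum=self.momentum, eps=self.eps,
            )
        # Default path: usual BatchNorm behaviour (update running stats in
        # training mode, use them for normalization in eval mode).
        return super().forward(x)
```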
-
```
RTX 2080 Ti
python 3.7.7 hcff3b4d_5
cuda100 1.0 0 pytorch
pytorch 0.4.1 py37_py…
```
-
Hi, I have a question: what value of T (i.e. `self.tau`) did you choose, and how should I set this value when training my own project?
T = self.tau
# taken from https://git…
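For context, `T`/`self.tau` is a distillation temperature hyperparameter with no single correct value; values in roughly the 1–10 range are common and are usually tuned on a validation set. A minimal sketch of the standard Hinton-style softened-softmax distillation loss, with `tau=4.0` purely as an illustrative choice:

```python
# Illustrative Hinton-style KD loss; tau (the temperature T) is a hyperparameter.
import torch
import torch.nn.functional as F

def kd_loss(student_logits: torch.Tensor, teacher_logits: torch.Tensor,
            tau: float = 4.0) -> torch.Tensor:
    # Soften both distributions with the same temperature T = tau.
    log_p_student = F.log_softmax(student_logits / tau, dim=-1)
    p_teacher = F.softmax(teacher_logits / tau, dim=-1)
    # KL divergence between teacher and student, scaled by tau**2 so gradient
    # magnitudes stay comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * tau ** 2
```

Larger `tau` flattens the teacher distribution and puts more weight on inter-class similarities, which is why it is treated as a tunable hyperparameter rather than a fixed constant.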
-
In the paper, the authors claim that distillation with unlabeled examples improves fine-tuned models in two ways, as shown in Figure 6:
(1) when the student model has a smaller architecture than the tea…
-
Here are some topic suggestions for the presentations.
Please comment with the topic you want to work on!
## Tools and Frameworks
- GitHub Copilot
- Langchain
- Grammarly 👉🏼 @Leamayf
- Scikit …