-
### System Info
```
pip install git+https://github.com/huggingface/transformers.git
pip install tokenizers==0.20.0
pip install accelerate==0.34.2
pip install git+https://github.com/huggingface/tr…
```
-
### System Info
Latest TRL, installed from source. I can't run TRL env right now because the cluster is shut down, but I'm installing everything from source.
If required, I will restart the cluster and run it.
### Information
- [ ] Th…
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussion…
-
Hi @Kinyugo,
Great repo, thank you for the work!
I just wanted to clarify something. From my understanding, what you have implemented here is Consistency Distillation (CD), right? The consisten…
-
Hi Chuangguang,
Great work and thanks for sharing your code!
I have a question regarding the student networks in your method. From what I’ve seen, the student networks are all trained from scrat…
-
Hi, thank you for this great work! We used the pre-training code and data provided in the current repository to run pre-training, but the performance on downstream tasks was not as strong as the VoCo_…
-
Is there any data that you can share on how long it took to train the student models with the recommended setup of [4 GPUs with 12GB memory](https://github.com/browsermt/students/blob/master/train-stu…
-
self.vph = torch.load('vph_imagenet.pt')
self.swin = torch.load('swin_imagenet.pt')
-
After training en-hu, we noticed a somewhat larger quality gap of 4 BLEU points between the teacher and student models.
It’s 24.8 for the quantized and fine-tuned student vs 30.2 BLEU for the teache…