-
### Prerequisite
- [X] I have searched [the existing and past issues](https://github.com/open-mmlab/mmyolo/issues) but cannot get the expected help.
- [X] I have read the [FAQ documentation](https…
-
Hello! I am training the first two knowledge distillation stages of Mamba 2 on one DGX-H100x8 node, and I am seeing training times of ~8 hours for the first stage and ~13 hours for the second stag…
-
### Before Asking
- [X] I have read the [README](https://github.com/meituan/YOLOv6/blob/main/README.md) carefully.
- [X] I want to train my custom dataset, and I have read the …
-
# Reproducer
```
import distily

distily.run.benchmark(
    teacher_model_name_or_path="gpt2",
    output_dir="distily_verify_compile",
    hub_model_id="distily/distily_verify_compile",
    …
)
```
-
Thanks for your great work, and I really appreciate that the code has been released! However, I have some questions as I try to carry out the distillation experiments:
1. How is the process actually c…
-
Hello,
Can I use a Jetson Nano kit to train the models in your repository on my custom dataset?
Would the Jetson Nano kit's specifications be sufficient to train models designed according to distillatio…
-
### Describe the bug
Setting the threshold via `ManualThreshold` seems to have no effect on the output of `predict`.
I set `default_value` in `ManualThreshold` to 0.99 and then to 0.0001, but the images…
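For context, here is a minimal sketch of the effect I expect the threshold to have on the predictions. `binarize` is a hypothetical helper for illustration only (not the library's API), and the score values are made up:

```
import numpy as np

def binarize(anomaly_map: np.ndarray, threshold: float) -> np.ndarray:
    # Mark pixels at or above the threshold as anomalous (hypothetical helper).
    return (anomaly_map >= threshold).astype(np.uint8)

# Made-up anomaly scores in [0, 1].
scores = np.array([[0.20, 0.50],
                   [0.70, 0.95]])

# Expected behavior for the two default_value settings from this report:
print(binarize(scores, 0.0001))  # nearly every pixel flagged -> all ones
print(binarize(scores, 0.99))    # almost no pixel flagged    -> all zeros
```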
-
Thank you for sharing this great repo.
Can you please provide instructions, or code if available, for task distillation on the SQuAD dataset?
Thanks in advance.
-
I ran `python3 run_distillation.py --dataset cifar10 --save_path path/to/directory/ --samples_per_class 10 --platt --learn_labels` to generate a distilled set on CIFAR-10. But when I wan…
-
I noticed that in Table 12 of your paper the hyperparameter $\beta$ is set to a very low value, 1e-8, which suggests that the proposed code-based distillation process plays an almost negligible role dur…
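To make the concern concrete, here is the scaling argument, assuming $\beta$ enters the objective as a standard weighted sum (an assumption on my part; the exact formulation is the one in the paper):

```
% Assumed weighted-sum objective (illustrative only; see the paper for the exact form):
\[
  \mathcal{L}_{\text{total}} \;=\; \mathcal{L}_{\text{task}} \;+\; \beta\,\mathcal{L}_{\text{code}},
  \qquad \beta = 10^{-8}.
\]
% Unless \mathcal{L}_{\text{code}} or its gradient is roughly 10^{8} times larger
% than the task term, the code-based distillation term contributes almost nothing
% to the parameter updates.
```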