-
Use captured bicycle video recordings to distill our ViT and SAM models used for inference.
-
Thanks so much for open-sourcing your work!
When I train the student with datafree_kd.py using the parameters in README.md, the network trains until epoch 138, when the if in fast_meta.py li…
-
Hi, I have a question regarding the loss used in the outer loop as part of your paper CAFE. It is referenced as
**loss_real = criterion(output_real, lab_real_gather)** in your code, in distill.py. …
-
In TinyBERT/task_distill.py, line 973:
``` python
elif output_mode == "regression":
    loss_mse = MSELoss()
    cls_loss = loss_mse(student_logits.view(-1), label_ids.view(-1))
```
so TinyBERT i…
-
A recent update, sometime between yesterday and today, broke GGUF speed at increased CFG values.
With CFG=1, speed is regular, as expected.
With CFG=1.5, for example, generation is 10x slower.
NF4 works fine. Restarted…
-
Hi,
I have already finished the annotation step and have the annotations.tsv file; now I am doing the distill step. Here is my command line: "DRAM-v.py distill -i annotations.tsv -o distilled2". H…
-
Whenever I try to run the translation part, the following error comes up on the console: `Error loading pipeline: SyntaxError: Unexpected token '
-
This is exacerbated by the fact that CMake's error messages point at the wrong variables, so it can be quite frustrating to figure out. I wrote down my understanding of this in https://github.com…
-
I've been experimenting with the Distil-Whisper models as an alternative to the standard Whisper models. While I was able to successfully integrate the distilled models, I'm experiencing some issues with…
-