arcee-ai/DistillKit
An Open Source Toolkit For LLM Distillation
GNU Affero General Public License v3.0 · 337 stars · 36 forks
Issues (newest first)
#17 · Models with same architecture but different tokenizer · bil-ash · opened 2 days ago · 0 comments
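Logit-based distillation normally assumes teacher and student share a vocabulary. When only the tokenizer differs, one published workaround (the Universal Logit Distillation loss of Boizard et al., 2024) compares rank-sorted probability vectors, so no token-to-token mapping is needed. A minimal sketch of that idea, not DistillKit code:

```python
import torch.nn.functional as F

def sorted_prob_loss(student_logits, teacher_logits, temperature=1.0):
    """Cross-tokenizer KD sketch: compare rank-sorted probabilities.

    Sorting removes the dependence on vocabulary alignment, at the cost
    of discarding which token each probability belongs to.
    """
    s = F.softmax(student_logits / temperature, dim=-1).sort(-1, descending=True).values
    t = F.softmax(teacher_logits / temperature, dim=-1).sort(-1, descending=True).values
    # Zero-pad the smaller vocabulary so the two tensors line up.
    vocab = max(s.size(-1), t.size(-1))
    s = F.pad(s, (0, vocab - s.size(-1)))
    t = F.pad(t, (0, vocab - t.size(-1)))
    return (s - t).abs().sum(-1).mean()
```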
#16 · AttributeError: 'DataParallel' object has no attribute 'device' · Wolfman1219 · opened 1 week ago · 0 comments
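`torch.nn.DataParallel` wraps the underlying model and does not forward attributes such as `.device`, so any code that reads `model.device` breaks once the model is wrapped. A defensive helper along these lines avoids the crash (a sketch, not the repo's actual fix):

```python
import torch
from torch.nn.parallel import DataParallel, DistributedDataParallel

def get_model_device(model: torch.nn.Module) -> torch.device:
    # DataParallel/DDP wrappers keep the real model under `.module`
    # and do not forward attribute access such as `.device`.
    if isinstance(model, (DataParallel, DistributedDataParallel)):
        model = model.module
    # Works for any module, wrapped or not.
    return next(model.parameters()).device
```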
#15 · Support Llama 3.2 1B model? · JinYu1998 · opened 2 weeks ago · 0 comments
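Llama 3.2 1B uses the same `LlamaForCausalLM` architecture as earlier Llama 3 checkpoints, so a pipeline that already handles Llama models should load it unchanged via Transformers (the repo is gated, so the license must be accepted on the Hub first):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.2-1B"  # gated repo: accept the license on the Hub first
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
```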
#14 · Is dense-to-MoE, MoE-to-dense, or MoE-to-MoE distillation supported? · linux-leo · opened 1 month ago · 0 comments
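At the logit level, distillation is architecture-agnostic: provided teacher and student share a tokenizer, a dense student can learn from an MoE teacher (or any combination) with the standard KL objective, and nothing MoE-specific is required. A generic sketch:

```python
import torch.nn.functional as F

def kd_kl_loss(student_logits, teacher_logits, temperature=2.0):
    """Standard KL distillation loss; indifferent to whether either model
    is MoE, as long as both emit logits over the same vocabulary."""
    s_logprobs = F.log_softmax(student_logits / temperature, dim=-1)
    t_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # T^2 keeps gradient magnitudes comparable across temperatures
    # (Hinton et al., 2015).
    return F.kl_div(s_logprobs, t_probs, reduction="batchmean") * temperature**2
```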
#13 · [News] GKD method · kashif · opened 1 month ago · 1 comment
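GKD (Generalized Knowledge Distillation, Agarwal et al.) mixes on-policy student generations into the distillation objective; TRL ships a `GKDTrainer` implementing it. A rough usage sketch, with argument names as in recent TRL releases (check your installed version):

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import GKDConfig, GKDTrainer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B-Instruct")
student = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B-Instruct")
teacher = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-1.5B-Instruct")

# Tiny in-memory dataset in the conversational format GKDTrainer expects.
train_dataset = Dataset.from_dict({
    "messages": [[
        {"role": "user", "content": "What colour is the sky?"},
        {"role": "assistant", "content": "The sky is blue."},
    ]] * 8
})

trainer = GKDTrainer(
    model=student,
    teacher_model=teacher,
    # lmbda: fraction of on-policy (student-generated) batches;
    # beta: interpolation of the generalized Jensen-Shannon divergence.
    args=GKDConfig(output_dir="gkd-out", lmbda=0.5, beta=0.5,
                   per_device_train_batch_size=1),
    processing_class=tokenizer,
    train_dataset=train_dataset,
)
trainer.train()
```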
#12 · How can distillation be carried out under these circumstances? · lean-wang · opened 1 month ago · 0 comments
#11 · Can it support multi-node training? · zidong-onepiece1 · opened 1 month ago · 0 comments
#10 · Deprecated positional argument(s) used in SFTTrainer · JinYu1998 · closed 1 month ago · 7 comments
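Recent TRL versions raise this warning because options that used to be passed straight to `SFTTrainer` (e.g. `max_seq_length`, `packing`, `dataset_text_field`) moved onto `SFTConfig`. A minimal sketch of the keyword-only style; exact field names vary across TRL versions:

```python
from datasets import Dataset
from trl import SFTConfig, SFTTrainer

train_dataset = Dataset.from_dict({"text": ["Distillation transfers knowledge."] * 8})

# Options that used to be SFTTrainer arguments now live on SFTConfig.
config = SFTConfig(
    output_dir="sft-out",
    max_seq_length=512,
    packing=False,
    dataset_text_field="text",
)
trainer = SFTTrainer(
    model="gpt2",  # SFTTrainer accepts a model id string or a loaded model
    args=config,
    train_dataset=train_dataset,
)
```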
#9 · After training, the model output does not stop · blackblue9 · opened 2 months ago · 4 comments
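Endless generation after training is usually an EOS problem: either training samples were never terminated with the EOS token, so the student never learns to emit it, or generation runs without a valid `eos_token_id`. Two generic checks (illustrated with GPT-2 as a stand-in, not DistillKit code):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # stand-in model for illustration
model = AutoModelForCausalLM.from_pretrained("gpt2")

# 1) Training samples should end with EOS so the student learns to stop.
sample = "Q: What colour is the sky? A: Blue." + tok.eos_token

# 2) Generation needs a valid eos id to stop on (and a pad id to avoid warnings).
inputs = tok(sample, return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=32,
    eos_token_id=tok.eos_token_id,
    pad_token_id=tok.pad_token_id if tok.pad_token_id is not None else tok.eos_token_id,
)
print(tok.decode(out[0]))
```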
#8 · Update distil_hidden.py for teacher tokenizer · fernando-neto-ai · closed 2 months ago · 0 comments
#7 · Teacher uses student tokenizer in distil_hidden.py · HeegonJin · closed 2 months ago · 1 comment
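#7 and #8 describe the same bug: with different teacher and student tokenizers, the teacher must receive ids from its own tokenizer, otherwise its hidden states are computed over mismatched token ids. The shape of the fix, with illustrative variable names rather than the actual distil_hidden.py code:

```python
import torch

# batch_texts, both tokenizers, and both models are assumed defined;
# the point is that the same raw text is tokenized twice, once per model,
# instead of feeding the student's input ids to the teacher.
student_inputs = student_tokenizer(batch_texts, return_tensors="pt", padding=True)
teacher_inputs = teacher_tokenizer(batch_texts, return_tensors="pt", padding=True)

student_out = student(**student_inputs, output_hidden_states=True)
with torch.no_grad():
    teacher_out = teacher(**teacher_inputs, output_hidden_states=True)
```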
#6 · Plans for larger models? · YixinSong-e · opened 2 months ago · 2 comments
#5 · Encoder model distillation? · riyajatar37003 · closed 2 months ago · 1 comment
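The toolkit is aimed at LLM distillation, but the hidden-state objective carries over to encoders: distil a pooled representation with MSE (or cosine). A generic sketch, assuming a shared tokenizer and hidden sizes already projected to a common dimension:

```python
import torch
import torch.nn.functional as F

def encoder_distill_loss(student_hidden, teacher_hidden, attention_mask):
    """MSE between mean-pooled encoder representations.

    student_hidden / teacher_hidden: (batch, seq, dim) last hidden states,
    assumed tokenized identically and projected to a common dimension.
    """
    mask = attention_mask.unsqueeze(-1).float()
    s = (student_hidden * mask).sum(1) / mask.sum(1)
    t = (teacher_hidden * mask).sum(1) / mask.sum(1)
    return F.mse_loss(s, t)
```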
#4 · CUDA out-of-memory issue · avemio-digital · opened 2 months ago · 8 comments
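Distillation keeps two models resident at once, so it runs out of memory sooner than plain fine-tuning. The usual levers, shown as Transformers `TrainingArguments` settings:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="distill-out",
    per_device_train_batch_size=1,   # smallest batch...
    gradient_accumulation_steps=16,  # ...with accumulation to keep the effective batch
    gradient_checkpointing=True,     # trade compute for activation memory
    bf16=True,                       # half-precision weights and activations
)
# Also common: load the frozen teacher in lower precision with
# requires_grad disabled, since it only does forward passes.
```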
#3 · RuntimeError: 'weight' must be 2-D · Hasan-Syed25 · opened 2 months ago · 4 comments
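This error is characteristic of DeepSpeed ZeRO stage 3, which partitions parameters into flattened 1-D shards; an embedding lookup on a still-partitioned weight then fails. If the teacher only does forward passes, gathering its parameters (or excluding the teacher from ZeRO-3 entirely) avoids the crash. A sketch with DeepSpeed's gather context and illustrative variable names:

```python
import deepspeed

# teacher / teacher_inputs are illustrative names. modifier_rank=None
# materializes the partitioned weights read-only, for a forward pass only.
with deepspeed.zero.GatheredParameters(list(teacher.parameters()), modifier_rank=None):
    teacher_out = teacher(**teacher_inputs)
```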
#2 · Distillation with distil_hidden.py fails on PyTorch 2.4 with "_get_socket_with_port" · Nottlespike · closed 2 months ago · 5 comments
#1 · Added some initial logic to load the teacher logits · shamanez · opened 2 months ago · 4 comments
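Precomputing teacher logits offline decouples the expensive teacher forward pass from student training; since full-vocabulary logits are enormous, the standard compression is to keep only the top-k values and indices per position. A sketch of that idea, not the PR's actual code:

```python
import torch

@torch.no_grad()
def topk_teacher_logits(teacher, input_ids, k=64):
    """Compress teacher logits to top-k (values, indices), each (batch, seq, k)."""
    logits = teacher(input_ids=input_ids).logits
    return logits.topk(k, dim=-1)

def expand_topk(values, indices, vocab_size, fill=-1e9):
    """Rebuild a dense (approximate) logit tensor from stored top-k pairs."""
    dense = torch.full(
        (*values.shape[:-1], vocab_size), fill,
        dtype=values.dtype, device=values.device,
    )
    return dense.scatter_(-1, indices, values)
```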