arcee-ai/DistillKit
An Open Source Toolkit For LLM Distillation
GNU Affero General Public License v3.0 · 337 stars · 36 forks
Issues (newest first)
#17 · Models with same architecture but different tokenizer · bil-ash · opened 2 days ago · 0 comments
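Logit-based distillation normally assumes teacher and student share a vocabulary. When only the tokenizer differs, one published workaround (the Universal Logit Distillation loss of Boizard et al., 2024) compares rank-sorted probability vectors, so no token-to-token mapping is needed. A minimal sketch of that idea, not DistillKit code:

```python
import torch.nn.functional as F

def sorted_prob_loss(student_logits, teacher_logits, temperature=1.0):
    """Cross-tokenizer KD sketch: compare rank-sorted probabilities.

    Sorting removes the dependence on vocabulary alignment, at the cost
    of discarding which token each probability belongs to.
    """
    s = F.softmax(student_logits / temperature, dim=-1).sort(-1, descending=True).values
    t = F.softmax(teacher_logits / temperature, dim=-1).sort(-1, descending=True).values
    # Zero-pad the smaller vocabulary so the two tensors line up.
    vocab = max(s.size(-1), t.size(-1))
    s = F.pad(s, (0, vocab - s.size(-1)))
    t = F.pad(t, (0, vocab - t.size(-1)))
    return (s - t).abs().sum(-1).mean()
```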
#16 · AttributeError: 'DataParallel' object has no attribute 'device' · Wolfman1219 · opened 1 week ago · 0 comments
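`torch.nn.DataParallel` wraps the underlying model and does not forward attributes such as `.device`, so any code that reads `model.device` breaks once the model is wrapped. A defensive helper along these lines avoids the crash (a sketch, not the repo's actual fix):

```python
import torch
from torch.nn.parallel import DataParallel, DistributedDataParallel

def get_model_device(model: torch.nn.Module) -> torch.device:
    # DataParallel/DDP wrappers keep the real model under `.module`
    # and do not forward attribute access such as `.device`.
    if isinstance(model, (DataParallel, DistributedDataParallel)):
        model = model.module
    # Works for any module, wrapped or not.
    return next(model.parameters()).device
```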
#15 · Support Llama 3.2 1B model? · JinYu1998 · opened 2 weeks ago · 0 comments
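Llama 3.2 1B uses the same `LlamaForCausalLM` architecture as earlier Llama 3 checkpoints, so a pipeline that already handles Llama models should load it unchanged via Transformers (the repo is gated, so the license must be accepted on the Hub first):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.2-1B"  # gated repo: accept the license on the Hub first
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
```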
#14 · Is dense-to-MoE, MoE-to-dense, or MoE-to-MoE distillation supported? · linux-leo · opened 1 month ago · 0 comments
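At the logit level, distillation is architecture-agnostic: provided teacher and student share a tokenizer, a dense student can learn from an MoE teacher (or any combination) with the standard KL objective, and nothing MoE-specific is required. A generic sketch:

```python
import torch.nn.functional as F

def kd_kl_loss(student_logits, teacher_logits, temperature=2.0):
    """Standard KL distillation loss; indifferent to whether either model
    is MoE, as long as both emit logits over the same vocabulary."""
    s_logprobs = F.log_softmax(student_logits / temperature, dim=-1)
    t_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # T^2 keeps gradient magnitudes comparable across temperatures
    # (Hinton et al., 2015).
    return F.kl_div(s_logprobs, t_probs, reduction="batchmean") * temperature**2
```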
#13 · [News] GKD method · kashif · opened 1 month ago · 1 comment
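GKD (Generalized Knowledge Distillation, Agarwal et al.) mixes on-policy student generations into the distillation objective; TRL ships a `GKDTrainer` implementing it. A rough usage sketch, with argument names as in recent TRL releases (check your installed version):

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import GKDConfig, GKDTrainer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B-Instruct")
student = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B-Instruct")
teacher = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-1.5B-Instruct")

# Tiny in-memory dataset in the conversational format GKDTrainer expects.
train_dataset = Dataset.from_dict({
    "messages": [[
        {"role": "user", "content": "What colour is the sky?"},
        {"role": "assistant", "content": "The sky is blue."},
    ]] * 8
})

trainer = GKDTrainer(
    model=student,
    teacher_model=teacher,
    # lmbda: fraction of on-policy (student-generated) batches;
    # beta: interpolation of the generalized Jensen-Shannon divergence.
    args=GKDConfig(output_dir="gkd-out", lmbda=0.5, beta=0.5,
                   per_device_train_batch_size=1),
    processing_class=tokenizer,
    train_dataset=train_dataset,
)
trainer.train()
```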
#12 · How can distillation be carried out under these circumstances? · lean-wang · opened 1 month ago · 0 comments
#11 · Can it support multi-node training? · zidong-onepiece1 · opened 1 month ago · 0 comments
#10 · Deprecated positional argument(s) used in SFTTrainer · JinYu1998 · closed 1 month ago · 7 comments
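Recent TRL versions raise this warning because options that used to be passed straight to `SFTTrainer` (e.g. `max_seq_length`, `packing`, `dataset_text_field`) moved onto `SFTConfig`. A minimal sketch of the keyword-only style; exact field names vary across TRL versions:

```python
from datasets import Dataset
from trl import SFTConfig, SFTTrainer

train_dataset = Dataset.from_dict({"text": ["Distillation transfers knowledge."] * 8})

# Options that used to be SFTTrainer arguments now live on SFTConfig.
config = SFTConfig(
    output_dir="sft-out",
    max_seq_length=512,
    packing=False,
    dataset_text_field="text",
)
trainer = SFTTrainer(
    model="gpt2",  # SFTTrainer accepts a model id string or a loaded model
    args=config,
    train_dataset=train_dataset,
)
```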
#9 · After training, the model output does not stop · blackblue9 · opened 2 months ago · 4 comments
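Endless generation after training is usually an EOS problem: either training samples were never terminated with the EOS token, so the student never learns to emit it, or generation runs without a valid `eos_token_id`. Two generic checks (illustrated with GPT-2 as a stand-in, not DistillKit code):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # stand-in model for illustration
model = AutoModelForCausalLM.from_pretrained("gpt2")

# 1) Training samples should end with EOS so the student learns to stop.
sample = "Q: What colour is the sky? A: Blue." + tok.eos_token

# 2) Generation needs a valid eos id to stop on (and a pad id to avoid warnings).
inputs = tok(sample, return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=32,
    eos_token_id=tok.eos_token_id,
    pad_token_id=tok.pad_token_id if tok.pad_token_id is not None else tok.eos_token_id,
)
print(tok.decode(out[0]))
```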
#8 · Update distil_hidden.py for teacher tokenizer · fernando-neto-ai · closed 2 months ago · 0 comments
#7 · Teacher uses student tokenizer in distil_hidden.py · HeegonJin · closed 2 months ago · 1 comment
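#7 and #8 describe the same bug: with different teacher and student tokenizers, the teacher must receive ids from its own tokenizer, otherwise its hidden states are computed over mismatched token ids. The shape of the fix, with illustrative variable names rather than the actual distil_hidden.py code:

```python
import torch

# batch_texts, both tokenizers, and both models are assumed defined;
# the point is that the same raw text is tokenized twice, once per model,
# instead of feeding the student's input ids to the teacher.
student_inputs = student_tokenizer(batch_texts, return_tensors="pt", padding=True)
teacher_inputs = teacher_tokenizer(batch_texts, return_tensors="pt", padding=True)

student_out = student(**student_inputs, output_hidden_states=True)
with torch.no_grad():
    teacher_out = teacher(**teacher_inputs, output_hidden_states=True)
```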
#6 · Plans for larger models? · YixinSong-e · opened 2 months ago · 2 comments
#5 · Encoder model distillation? · riyajatar37003 · closed 2 months ago · 1 comment
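The toolkit is aimed at LLM distillation, but the hidden-state objective carries over to encoders: distil a pooled representation with MSE (or cosine). A generic sketch, assuming a shared tokenizer and hidden sizes already projected to a common dimension:

```python
import torch
import torch.nn.functional as F

def encoder_distill_loss(student_hidden, teacher_hidden, attention_mask):
    """MSE between mean-pooled encoder representations.

    student_hidden / teacher_hidden: (batch, seq, dim) last hidden states,
    assumed tokenized identically and projected to a common dimension.
    """
    mask = attention_mask.unsqueeze(-1).float()
    s = (student_hidden * mask).sum(1) / mask.sum(1)
    t = (teacher_hidden * mask).sum(1) / mask.sum(1)
    return F.mse_loss(s, t)
```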
#4 · CUDA out-of-memory issue · avemio-digital · opened 2 months ago · 8 comments
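Distillation keeps two models resident at once, so it runs out of memory sooner than plain fine-tuning. The usual levers, shown as Transformers `TrainingArguments` settings:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="distill-out",
    per_device_train_batch_size=1,   # smallest batch...
    gradient_accumulation_steps=16,  # ...with accumulation to keep the effective batch
    gradient_checkpointing=True,     # trade compute for activation memory
    bf16=True,                       # half-precision weights and activations
)
# Also common: load the frozen teacher in lower precision with
# requires_grad disabled, since it only does forward passes.
```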
#3 · RuntimeError: 'weight' must be 2-D · Hasan-Syed25 · opened 2 months ago · 4 comments
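This error is characteristic of DeepSpeed ZeRO stage 3, which partitions parameters into flattened 1-D shards; an embedding lookup on a still-partitioned weight then fails. If the teacher only does forward passes, gathering its parameters (or excluding the teacher from ZeRO-3 entirely) avoids the crash. A sketch with DeepSpeed's gather context and illustrative variable names:

```python
import deepspeed

# teacher / teacher_inputs are illustrative names. modifier_rank=None
# materializes the partitioned weights read-only, for a forward pass only.
with deepspeed.zero.GatheredParameters(list(teacher.parameters()), modifier_rank=None):
    teacher_out = teacher(**teacher_inputs)
```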
#2 · Distillation with distil_hidden.py fails on PyTorch 2.4 with "_get_socket_with_port" · Nottlespike · closed 2 months ago · 5 comments
#1 · Added some initial logic to load the teacher logits · shamanez · opened 2 months ago · 4 comments
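Precomputing teacher logits offline decouples the expensive teacher forward pass from student training; since full-vocabulary logits are enormous, the standard compression is to keep only the top-k values and indices per position. A sketch of that idea, not the PR's actual code:

```python
import torch

@torch.no_grad()
def topk_teacher_logits(teacher, input_ids, k=64):
    """Compress teacher logits to top-k (values, indices), each (batch, seq, k)."""
    logits = teacher(input_ids=input_ids).logits
    return logits.topk(k, dim=-1)

def expand_topk(values, indices, vocab_size, fill=-1e9):
    """Rebuild a dense (approximate) logit tensor from stored top-k pairs."""
    dense = torch.full(
        (*values.shape[:-1], vocab_size), fill,
        dtype=values.dtype, device=values.device,
    )
    return dense.scatter_(-1, indices, values)
```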