songmzhang/DSKD
Repo for the paper "Dual-Space Knowledge Distillation for Large Language Models".
25 stars · 3 forks
Issues
#16 · Is the TinyLlama in the description a base model or a pretrained model? · survivebycoding · opened 4 days ago · 1 comment
#15 · Reproduction of results · mathamateur · opened 1 week ago · 1 comment
#14 · using mistral from · survivebycoding · closed 4 days ago · 6 comments
#13 · load 72B teacher model · ypw-lbj · opened 1 week ago · 3 comments
#12 · Evaluation script error with TinyLlama · srikhetramohanty · opened 2 weeks ago · 2 comments
#11 · qwen · zjjznw123 · opened 2 weeks ago · 3 comments
#10 · Concern regarding performance · survivebycoding · closed 1 week ago · 15 comments
#9 · Running inference using evaluation scripts · srikhetramohanty · closed 2 weeks ago · 2 comments
#8 · Getting an error when trying to perform SFT on TinyLlama · survivebycoding · closed 3 weeks ago · 10 comments
#7 · More description on the output folder created · survivebycoding · closed 4 weeks ago · 1 comment
#6 · Can we use this code for CPU? · survivebycoding · closed 1 month ago · 1 comment
#5 · Need LLaMA .bin file instead of .pth file · survivebycoding · closed 1 month ago · 2 comments
#4 · From where should we download the models? · survivebycoding · closed 1 month ago · 4 comments
#3 · Usage with other model combinations · botox-100 · closed 1 month ago · 4 comments
#2 · About SeqKD with different vocabularies · 2018cx · closed 2 months ago · 3 comments
#1 · On the calculation of AKL · wutaiqiang · closed 2 months ago · 1 comment