In Table 1 of your MiniLLM paper, the R-L (Rouge-L) results for LLaMA-7B SFT w/o KD are as follows, compared with what I reproduced:
| | DollyEval | SelfInst | VicunaEval | S-NI | UnNI |
|---|---|---|---|---|---|
| Reported in paper | 26.3 | 20.8 | 17.5 | 32.4 | 35.8 |
| Reproduced | 25.4 | 16.9 | 18.4 | 28.6 | 31.0 |
However, I am having trouble reproducing this student model. Have I missed any key components of SFT on LLaMA? I simply ran `scripts/llama/sft/sft_7B.sh` on a single node with 8 × 32 GB V100 GPUs, after using your data processing scripts to obtain the full Dolly dataset.