songmzhang / DSKD

Repo for the EMNLP'24 Paper "Dual-Space Knowledge Distillation for Large Language Models".

GPT2-1.5B Pretrained Teacher on Dolly #17

Closed cpsu00 closed 2 months ago

cpsu00 commented 2 months ago

Hello, I'm not sure where the GPT2-1.5B pretrained teacher on Dolly is located behind the link in the readme. Can you guide me to where I can find it?

songmzhang commented 2 months ago

For this model, we directly use the checkpoint released by Gu et al. in MiniLLM. The link in the readme points to the MiniLLM repo. You can download the GPT2 models via the commands provided there. Then untar `gpt2.tar` and you will find the checkpoint in `gpt2/train/sft/gpt2-xlarge`.
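
For reference, here is a minimal sketch of the extraction step, assuming `gpt2.tar` has already been downloaded into the current directory via the commands in the MiniLLM readme (the archive filename and extraction directory are assumptions; the checkpoint path is the one mentioned above):

```python
import os
import tarfile

# Assumed filename of the archive downloaded via the MiniLLM readme commands.
archive_path = "gpt2.tar"
extract_dir = "."

# Extract the tarball; the SFT teacher checkpoint should end up under
# gpt2/train/sft/gpt2-xlarge.
with tarfile.open(archive_path) as tar:
    tar.extractall(path=extract_dir)

teacher_dir = os.path.join(extract_dir, "gpt2", "train", "sft", "gpt2-xlarge")
print("Teacher checkpoint directory exists:", os.path.isdir(teacher_dir))
```

You can then point the teacher model path in your training config to that directory.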

cpsu00 commented 2 months ago

It works! Thanks for your fast reply.