Closed shhn1 closed 1 month ago
Thank you for your interest in our research.
First, since we share the same experimental settings as the MiniLLM paper, we obtained the download link for the processed_data.tar file from the corresponding GitHub repository. On checking, it seems that the link has changed; you can now download the file with the following command:
wget -O processed_data.tar https://unilm.blob.core.windows.net/minillm/MiniLLM/processed_data.tar
Thank you for bringing this to our attention, and we will make sure to update our code accordingly.
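Once the download finishes, it can help to check what the archive contains before extracting it. The snippet below is a minimal sketch using only Python's standard `tarfile` module; the helper name `list_archive_members` and the `limit` parameter are hypothetical, and no assumption is made about the file names inside processed_data.tar.

```python
import tarfile


def list_archive_members(archive_path, limit=20):
    """Return up to `limit` member names from a tar archive.

    Hypothetical helper for illustration; it only reads the archive
    index and does not extract anything to disk.
    """
    names = []
    # "r:*" lets tarfile detect the compression (if any) automatically.
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar:
            names.append(member.name)
            if len(names) >= limit:
                break
    return names
```

For example, `list_archive_members("processed_data.tar")` returns the first few paths in the archive, which shows which files to open when checking the data format.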
Regarding the data format, please refer to the files in processed_data.tar. Each line contains the keys instruction, input, output, and prompt. The prompt is structured as follows: “{instruction}, {input (if applicable)}, Response:”.
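For custom training data, the format described above could be produced with something like the following sketch. It assumes one JSON object per line (JSON Lines) and joins the prompt parts with newlines; both are assumptions, as are the helper names `build_prompt`, `make_record`, and `write_jsonl`. The exact prompt template and separators should be copied from the files in processed_data.tar rather than from this example.

```python
import json


def build_prompt(instruction, input_text=""):
    """Assemble a prompt from the instruction and an optional input.

    Assumption: parts are joined by newlines and end with "Response:".
    Check the real template against the files in processed_data.tar.
    """
    parts = [instruction]
    if input_text:  # the input is included only when it is non-empty
        parts.append(input_text)
    parts.append("Response:")
    return "\n".join(parts)


def make_record(instruction, output, input_text=""):
    """Build one example with the four keys named in the reply."""
    return {
        "instruction": instruction,
        "input": input_text,
        "output": output,
        "prompt": build_prompt(instruction, input_text),
    }


def write_jsonl(records, path):
    """Write one JSON object per line (assumed on-disk layout)."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

A record built with `make_record("Translate to French.", "Bonjour", "Hello")` carries all four keys, and one built without an input simply omits that part of the prompt.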
If you have any further questions, feel free to ask anytime.
Thanks for your kind reply! It helps me a lot. :)
Thanks for your great work!
I am very interested in the PromptKD method you proposed and tried to reproduce it, but I found that the processed_data.tar link is invalid and I can't download the file. I would be grateful if you could re-upload your processed_data.tar, which would be very helpful to me. In addition, if I want to replace it with my own training data, how should I organize my data format to meet the subsequent training requirements?
Looking forward to your reply :)