artidoro / qlora

QLoRA: Efficient Finetuning of Quantized LLMs
https://arxiv.org/abs/2305.14314
MIT License

Does finetuning need to follow the Llama 2 system prompt format? #229

Open Zheng392 opened 1 year ago

Zheng392 commented 1 year ago

The Llama 2 finetuning example uses the oasst1 dataset with the `### Human: ... ### Assistant: ` prompt format. However, Llama 2 chat models use the following prompt format:

<s>[INST] <<SYS>>
{{ system_prompt }}
<</SYS>>

{{ user_message }} [/INST]

Would it be a problem to fine-tune with a different prompt format, as in the oasst1 example?

Also, what is the difference between fine-tuning Llama-2-7b-hf versus Llama-2-7b-chat-hf?
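For reference, the two formats side by side as plain string builders. This is a minimal sketch; the helper names are illustrative and not part of the qlora repo:

```python
# Hypothetical helpers contrasting the two prompt formats discussed above.

def build_oasst1_prompt(user_message: str) -> str:
    """Format a single turn in the oasst1-style format used by the example."""
    return f"### Human: {user_message} ### Assistant: "

def build_llama2_prompt(system_prompt: str, user_message: str) -> str:
    """Format a single turn in the Llama 2 chat template."""
    return (
        f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

print(build_oasst1_prompt("What is QLoRA?"))
print(build_llama2_prompt("You are a helpful assistant.", "What is QLoRA?"))
```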

marclove commented 1 year ago

The base model hasn't been fine-tuned at all, so you can use whatever format you want.

However, keep in mind that if you fine-tune the base model on a different chat format, the library you use to generate completions must follow that same format. This can be a problem if, for instance, you use the HF Transformers conversational pipeline. The conversational pipeline looks for a `_build_conversation_input_ids` method on the model's tokenizer and, if it finds one, uses it to format and tokenize the dialogue. The Llama tokenizer formats conversations in the format shown above, so if you fine-tune with a different format and then use HF's Llama tokenizer with the conversational pipeline, you'll have problems.
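One way around this mismatch is to skip the conversational pipeline's built-in templating and build the prompt yourself at inference time, using the same template you fine-tuned with, then pass the string to the tokenizer and `model.generate` directly. A minimal sketch, assuming an oasst1-style fine-tune (the function name and exact template are illustrative):

```python
# Sketch: render the dialogue in the same "### Human:/### Assistant:" format
# used during fine-tuning, instead of relying on the pipeline's default
# Llama 2 template. The resulting string is what you would tokenize and
# feed to model.generate yourself.

def format_oasst1_dialogue(turns: list[tuple[str, str]],
                           next_user_message: str) -> str:
    """Render past (user, assistant) turns plus a new user message."""
    prompt = ""
    for user, assistant in turns:
        prompt += f"### Human: {user} ### Assistant: {assistant} "
    prompt += f"### Human: {next_user_message} ### Assistant: "
    return prompt

history = [("What is QLoRA?", "A method for finetuning quantized LLMs.")]
print(format_oasst1_dialogue(history, "Does it work with Llama 2?"))
```

The key point is consistency: whatever template the model saw during training is the template the generation code must reproduce, whichever library produces it.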