philschmid / easyllm

https://philschmid.github.io/easyllm/
MIT License
431 stars 34 forks source link

Chat completion format for empty system content. #6

Closed viniciusarruda closed 1 year ago

viniciusarruda commented 1 year ago

Hi,

I was looking the meta implementation for the chat completion, and it seems that there is no way to get an empty system. However, with easyllm is possible to use it with empty system input. Is it expected to work correctly? I mean, LLaMA v2 was trained with empty system?

philschmid commented 1 year ago

It's not completely clear how the training data looked. But you can definitely leave out the system prompt, especially for single-turn instructions. For chat completion though it is probably more helpful to include a system prompt. But you can definitely try multiple ways.

viniciusarruda commented 1 year ago

The stuff I tested, even using a badly formatted chat completion structure, LLaMA v2 could produce good results (when compared to the correct format). However, I'm looking for the ideal output, i.e., using the correct format used to train. If you know something about using chat completion without the system prompt, please let me know. I didn't found anything about it.

philschmid commented 1 year ago

If you fine-tuning dataset is not having a system prompt for your instructions then you should omit it.