Alpha-VLLM / LLaMA2-Accessory

An Open-source Toolkit for LLM Development
https://llama2-accessory.readthedocs.io/

Do you use only first conversation in the data? #117

Closed: yeonju7kim closed this issue 7 months ago

yeonju7kim commented 7 months ago

https://github.com/Alpha-VLLM/LLaMA2-Accessory/blob/32f9a9ebbaccfa5a1cda95bc3172f6129a8680f7/accessory/data/alpaca.py#L223C1-L224C65

In this code, it looks like the model only uses the first conversation turn. I found the same thing in the LLaMA-Adapter repository.

https://github.com/OpenGVLab/LLaMA-Adapter/issues/123

Is there any reason why the model only uses the first conversation?
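For illustration, here is a minimal sketch of the difference being asked about: taking only the first question/answer pair of a multi-turn record versus keeping every turn. The record format and helper names below are hypothetical, not taken from the repository's code.

```python
# Hypothetical ShareGPT-style record with two human/assistant turns.
record = {
    "conversations": [
        {"from": "human", "value": "What is attention?"},
        {"from": "gpt", "value": "A weighting mechanism over tokens."},
        {"from": "human", "value": "And multi-head attention?"},
        {"from": "gpt", "value": "Several attention heads run in parallel."},
    ]
}

def first_turn_only(rec):
    """Keep only the first human/gpt pair, discarding later turns."""
    convs = rec["conversations"]
    return {"instruction": convs[0]["value"], "output": convs[1]["value"]}

def all_turns(rec):
    """Keep every (human, gpt) pair for multi-turn fine-tuning."""
    convs = rec["conversations"]
    return [(convs[i]["value"], convs[i + 1]["value"])
            for i in range(0, len(convs) - 1, 2)]

print(first_turn_only(record))  # one pair: the second turn is dropped
print(len(all_turns(record)))   # 2
```

Training on `first_turn_only` output would lose the follow-up exchange entirely, which is the behavior the linked lines in `alpaca.py` appear to exhibit.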

gaopengpjlab commented 7 months ago

SPHINX uses multi-turn conversations.

linziyi96 commented 7 months ago

We use this dataset implementation for fine-tuning: https://github.com/Alpha-VLLM/LLaMA2-Accessory/blob/32f9a9ebbaccfa5a1cda95bc3172f6129a8680f7/accessory/data/conversation/dataset.py