Isn't LongChat a fine-tune of the Llama model, just with a tweaked script for the rotary embedding layer? Or do you have to do pre-training first before fine-tuning with the ShareGPT data? The pre-training script exists in the repo, which makes me assume we have to pre-train the Llama model first before fine-tuning.
@fahadh4ilyas Thanks for the question! The pre-training script was not used for LongChat. It is for people who wish to do some pre-training experiments.
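For context on the rotary-embedding tweak: LongChat condenses the rotary position embeddings by dividing position indices by an interpolation ratio, so a longer context maps into the position range Llama saw during pre-training. A minimal NumPy sketch of that idea (the function name, signature, and the ratio of 8 for a 16k context are illustrative assumptions, not the actual monkey-patch code):

```python
import numpy as np

def rotary_cos_sin(seq_len, dim, base=10000.0, ratio=1.0):
    """Build cos/sin tables for rotary embeddings.

    Dividing positions by `ratio` (position interpolation) condenses a
    long context into the model's original position range, which is the
    core of the rotary-embedding tweak discussed above.
    Note: illustrative sketch, not the actual LongChat patch.
    """
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    t = np.arange(seq_len) / ratio          # condensed position indices
    freqs = np.outer(t, inv_freq)
    emb = np.concatenate((freqs, freqs), axis=-1)
    return np.cos(emb), np.sin(emb)

# With ratio=8, every 8th position of a 16k context lands exactly on
# an integer position of the original 2k range.
cos_long, _ = rotary_cos_sin(16384, 128, ratio=8.0)
cos_short, _ = rotary_cos_sin(2048, 128, ratio=1.0)
assert np.allclose(cos_long[::8], cos_short)
```

Since the interpolated positions fall outside the integer grid the base model was trained on, fine-tuning (e.g. on ShareGPT data) is still needed to adapt the model to them, but no pre-training from scratch is required.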