OpenGVLab / LLaMA-Adapter

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
GNU General Public License v3.0

What is the training target, and how is the loss calculated in pretraining and finetuning? #105

Open adda1221 opened 1 year ago

adda1221 commented 1 year ago

Hi, thanks for open-sourcing this. I want to train the model on my own datasets. Could you please tell me the training target and how the loss is calculated in pretraining and finetuning?

ChrisLiu6 commented 1 year ago
  1. FYI, LLaMA2-Accessory, which has detailed documentation and tidied code, may be a better choice if you want to customize LLM training.
  2. Generally speaking, the training target for an LLM, whether in pre-training or fine-tuning, and whether single-modal or multi-modal, is next-token prediction. Given a sequence of word tokens, the model predicts each token from all the tokens before it (which is achieved through causal attention).
  3. Therefore, if you want to train on your own dataset, the simplest way is to prepare your data in the same format as ours and train on it. In most cases, you don't need to change the loss function.
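
To make the next-token objective in point 2 concrete, here is a minimal sketch in plain Python. It assumes the usual formulation, where the loss is the average cross-entropy between the model's scores at position t and the token at position t+1. The function name `next_token_loss` and the toy inputs are hypothetical and are not taken from this repository's code.

```python
import math

def next_token_loss(logits, tokens):
    """Average cross-entropy of predicting tokens[t+1] from logits[t].

    logits: list of T rows, each a list of V unnormalized scores
            (what a causal model would output at each position).
    tokens: list of T integer token ids (the input sequence itself).
    """
    total, count = 0.0, 0
    # Shift by one: position t is scored against token t+1, so the
    # last row of logits has no target and the first token is never
    # a target.
    for row, target in zip(logits[:-1], tokens[1:]):
        # Numerically stable log-sum-exp for the softmax normalizer.
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        # Negative log-probability of the correct next token.
        total += log_z - row[target]
        count += 1
    return total / count if count else 0.0

# Toy usage: vocabulary of 4 tokens, sequence of length 3.
# With uniform scores every token has probability 1/4, so the loss
# equals math.log(4), about 1.386.
uniform = [[0.0, 0.0, 0.0, 0.0] for _ in range(3)]
print(next_token_loss(uniform, [0, 1, 2]))
```

Because the targets are just the input sequence shifted by one position, the same function covers both pre-training and fine-tuning data. This is consistent with point 3: changing the dataset does not require changing the loss.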