Hi, thanks for open-sourcing this. I want to train the model on my own datasets. Could you please tell me the training target and how the loss is calculated in pre-training and fine-tuning?
FYI, with detailed documentation and tidied code, LLaMA2-Accessory may be a better choice if you want to customize LLM training.
Generally speaking, the training target for an LLM, whether in pre-training or fine-tuning, and whether single-modal or multi-modal, is next-token prediction. Given a sequence of word tokens, the model predicts each token conditioned on all tokens before it (which is achieved through causal attention), and the loss is the cross-entropy between the predicted distribution and the actual next token.
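To make this concrete, here is a minimal PyTorch sketch of how such a loss is typically computed. This is illustrative only and not the exact code in this repo; the function name `next_token_loss`, the tensor shapes, and the `ignore_index` convention are assumptions.

```python
import torch
import torch.nn.functional as F

def next_token_loss(logits: torch.Tensor,
                    labels: torch.Tensor,
                    ignore_index: int = -100) -> torch.Tensor:
    """Cross-entropy between the model's prediction at position t and the token at t+1.

    logits: (batch, seq_len, vocab_size) output of a causal LM
    labels: (batch, seq_len) token ids; positions to exclude from the loss
            (e.g. padding, or prompt tokens during instruction fine-tuning)
            should be set to `ignore_index` beforehand.
    """
    # Shift so that the prediction at position t is scored against the token at t+1.
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = labels[:, 1:].contiguous()
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
        ignore_index=ignore_index,
    )
```

The usual difference between pre-training and fine-tuning is not the loss formula but which positions contribute to it: in pre-training every token is typically scored, while in instruction fine-tuning the prompt tokens are often masked out (via `ignore_index`) so the loss is computed only on the response.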
Therefore, if you want to train on your own dataset, the simplest way is to prepare your data in the same format as ours and use it for training. In most cases, you don't need to change the loss function.