[Open] qmpham opened this issue 1 year ago
Hi, to train the model to generate GPT-like responses, we set the target sequence to the GPT response and the input/source sequence to the previous dialog history.
But LLaMA has a maximum input length of only 2048 tokens.
This can be handled by the data loader/tokenizer. For example, we can truncate the input on the left side if it exceeds the max length:
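(The original snippet appears to be missing here; below is a minimal sketch of left-side truncation using the Hugging Face `transformers` tokenizer API. The checkpoint name and the example dialog string are assumptions for illustration, not taken from this repo.)

```python
from transformers import AutoTokenizer

# Hypothetical checkpoint; substitute whichever tokenizer this repo uses.
tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b")

# Drop tokens from the left when the input is too long, so the most
# recent dialog turns on the right survive truncation.
tokenizer.truncation_side = "left"

# Assumed example input: concatenated previous dialog turns.
dialog_history = "User: Hi!\nAssistant: Hello!\nUser: Can you summarize our chat?"

encoded = tokenizer(
    dialog_history,
    truncation=True,
    max_length=2048,  # LLaMA's context window
)
```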
Yes, I understand. But why are you interested in long inputs when the model's capacity is only 2048 tokens? You risk truncating the very question that the target response addresses.
That's true, the dialog history commonly exceeds the maximum sequence length during training. However, we can mitigate this by truncating inputs on the left side, so that the most recent dialog history on the right is preserved:
See line 329 of `data_loading.py`.
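(For readers without the repo at hand, a minimal sketch of what such left-side truncation could look like at the token-ID level; the function name and signature are hypothetical, not the actual code at that line.)

```python
def truncate_left(token_ids: list[int], max_len: int = 2048) -> list[int]:
    """Keep only the last max_len tokens, so the most recent dialog
    history (on the right) is preserved and older turns are dropped."""
    if len(token_ids) <= max_len:
        return token_ids
    return token_ids[-max_len:]

# Example: a 3000-token history is cut down to its final 2048 tokens.
history = list(range(3000))
truncated = truncate_left(history)
assert len(truncated) == 2048 and truncated[-1] == history[-1]
```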