Thanks for your interest in LMFlow! That's a very important question and not trivial at all. The appropriate number of epochs varies from dataset to dataset. One approach is simply to try it: increase the number of epochs from small to large, e.g. 0.01, 0.1, 1, 10, and narrow the range according to the output performance.
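A minimal sketch of such a sweep is below; the script path, model, and dataset path are placeholders, and `--num_train_epochs` is the usual HuggingFace Trainer argument, so adjust everything to your LMFlow checkout and launcher:

```python
# Sketch of a coarse-to-fine epoch sweep; paths and the model/dataset names
# are assumptions, not LMFlow's canonical example configuration.
import subprocess

candidate_epochs = [0.01, 0.1, 1, 10]  # widen or narrow based on eval results

for epochs in candidate_epochs:
    output_dir = f"output_models/finetune_epochs_{epochs}"
    subprocess.run(
        [
            "python", "examples/finetune.py",        # assumed entry point
            "--model_name_or_path", "gpt2",          # placeholder model
            "--dataset_path", "data/alpaca/train",   # placeholder dataset
            "--num_train_epochs", str(epochs),       # HF Trainer argument
            "--output_dir", output_dir,
        ],
        check=True,
    )
    # Evaluate each resulting checkpoint on a held-out set and keep the
    # epoch count with the best validation performance.
```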
If you don't want this grouping behavior, you may pass the option `--disable_group_texts True`. Note that long samples will still be cut into smaller pieces so that the transformer model can accept the input.
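For reference, this grouping follows the usual causal-LM preprocessing: tokenized samples are concatenated and re-split into fixed-length blocks. A minimal sketch of that idea (the field names and block size are illustrative, not LMFlow's exact implementation):

```python
# Sketch of the standard "group texts" preprocessing for causal LM finetuning:
# concatenate all tokenized samples, then cut into equal-length blocks.
from itertools import chain

def group_texts(examples, block_size=512):
    # Concatenate the token ids of every sample in the batch.
    concatenated = list(chain(*examples["input_ids"]))
    # Drop the tail so the total length is a multiple of block_size.
    total_length = (len(concatenated) // block_size) * block_size
    # Re-split into fixed-length blocks the model can accept.
    blocks = [concatenated[i : i + block_size]
              for i in range(0, total_length, block_size)]
    return {"input_ids": blocks, "labels": [b[:] for b in blocks]}

# Example: three short "samples" are merged and re-chunked into blocks of 8.
batch = {"input_ids": [[1, 2, 3], [4, 5, 6, 7, 8, 9],
                       [10, 11, 12, 13, 14, 15, 16, 17]]}
print(group_texts(batch, block_size=8))
```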
Thanks for your prompt reply. May I ask how many epochs were used to finetune LLaMA 7B on Alpaca for the released checkpoint?
We use 3 epochs for both instruction tuning and medical dataset finetuning.
Thanks for your response!
Hello,
Probably a trivial question: the fine-tuning script does not take a `batch_size`. It looks like input datasets are somehow grouped. Is there any best practice for deciding a proper number of epochs for finetuning in LMFlow (e.g., how to compute the number of epochs needed to pass over the entire dataset)?