Alpha-VLLM / LLaMA2-Accessory

An Open-source Toolkit for LLM Development
https://llama2-accessory.readthedocs.io/

Is the number of batches in the FinetuneDataset decided by the smallest dataset? #93

Open ZhenYangIACAS opened 10 months ago

ZhenYangIACAS commented 10 months ago

I combine multiple meta_type datasets, but the log shows that the total number of batches in one epoch is decided by the smallest dataset. As the log below shows, when I add more datasets, the total number of batches in one epoch is still 2208, which is the batch count of the smallest dataset. Why is this?

log:

[18:41:11.182261] Epoch: [2] [2120/2208] lr: 0.000098 closs: 3.0516 (4.3038) grad_norm: 8.2056 (6.4250) time: 0.8935 data: 0.0003 max mem: 27193
[18:41:19.053091] Epoch: [2] [2130/2208] lr: 0.000098 closs: 2.6504 (4.2980) grad_norm: 8.2056 (6.4250) time: 0.8218 data: 0.0003 max mem: 27193
[18:41:28.146424] Epoch: [2] [2140/2208] lr: 0.000098 closs: 2.9470 (4.2958) grad_norm: 8.2056 (6.4250) time: 0.8480 data: 0.0003 max mem: 27193
[18:41:37.381831] Epoch: [2] [2150/2208] lr: 0.000098 closs: 3.6000 (4.2927)
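For intuition only, here is a minimal, hypothetical sketch (not LLaMA2-Accessory's actual implementation) of how combining per-dataset iterators with zip() would cap an epoch at the smallest dataset and produce exactly this symptom:

```python
# Hypothetical illustration: if one batch iterator per meta type is built and
# the iterators are zipped together, the epoch ends as soon as the smallest
# group is exhausted, so the step count stays at the smallest dataset's length.
from torch.utils.data import DataLoader, Dataset

class ToyDataset(Dataset):
    def __init__(self, n):
        self.n = n
    def __len__(self):
        return self.n
    def __getitem__(self, idx):
        return idx

loader_small = DataLoader(ToyDataset(2208 * 4), batch_size=4)   # small meta type
loader_large = DataLoader(ToyDataset(10000 * 4), batch_size=4)  # large meta type

steps = sum(1 for _ in zip(loader_small, loader_large))
print(steps)  # 2208 -- capped by the smaller dataset
```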

ChrisLiu6 commented 10 months ago

By design, LLaMA2-Accessory should not behave like this. Please check the log, where the composition of the dataset actually in use, as well as the total number of data items, is printed. You can search for the keyword "total length" in the log file.
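As a rough cross-check, assuming the usual relationship between dataset size and steps per epoch (the values below are placeholders, not taken from this issue), you can compare the expected iteration count against the 2208 shown in the progress log:

```python
# Rough sanity check, assuming iterations per epoch roughly equal
#   total_length / (batch_size_per_gpu * world_size * accum_iter).
# Replace the placeholder values with your own config.
total_length = 1_000_000   # the number printed after "total length" in the log
batch_size_per_gpu = 8     # per-GPU batch size from your finetuning config
world_size = 8             # number of GPUs used for training
accum_iter = 1             # gradient accumulation steps, if any

expected_iters = total_length // (batch_size_per_gpu * world_size * accum_iter)
print(expected_iters)      # compare against the 2208 in the progress log
```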

ZhenYangIACAS commented 10 months ago

@ChrisLiu6 The printed total length is correct, but the actual number of batches (2208 in the log above) does not seem to change with the total length. I am confused by this.
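If it helps to narrow things down, a hypothetical helper like the one below (placeholder names, not part of the repo) could be called right before the training loop to see whether the dataset, the sampler, or the dataloader is the component reporting 2208 batches:

```python
def report_epoch_length(dataset, sampler, data_loader):
    """Hypothetical debugging helper: print the length of each stage of the
    data pipeline to locate where the unexpected batch count comes from."""
    print("dataset items  :", len(dataset))
    print("sampler length :", len(sampler))
    print("loader batches :", len(data_loader))
```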