THUDM / ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM | 开源双语对话语言模型

[BUG/Help] How to precisely specify the number of epochs in ptuning/main.py #537

Open wizardforcel opened 1 year ago

wizardforcel commented 1 year ago

Is there an existing issue for this?

Current Behavior

[INFO|trainer.py:567] 2023-09-11 16:25:22,233 >> max_steps is given, it will override any value given in num_train_epochs
[INFO|trainer.py:1712] 2023-09-11 16:25:22,360 >> ***** Running training *****
[INFO|trainer.py:1713] 2023-09-11 16:25:22,360 >>   Num examples = 731
[INFO|trainer.py:1714] 2023-09-11 16:25:22,360 >>   Num Epochs = 9
[INFO|trainer.py:1715] 2023-09-11 16:25:22,360 >>   Instantaneous batch size per device = 1
[INFO|trainer.py:1718] 2023-09-11 16:25:22,360 >>   Total train batch size (w. parallel, distributed & accumulation) = 16
[INFO|trainer.py:1719] 2023-09-11 16:25:22,360 >>   Gradient Accumulation steps = 16
[INFO|trainer.py:1720] 2023-09-11 16:25:22,360 >>   Total optimization steps = 400
[INFO|trainer.py:1721] 2023-09-11 16:25:22,361 >>   Number of trainable parameters = 1,835,008

Expected Behavior

My dataset has 731 examples and the per-device batch size is 1. I set max_steps to 400, yet the log reports Num Epochs = 9. How is this number computed?

If I want to set the number of epochs directly instead, how should I do that?
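For what it's worth, the 9 appears to follow from the logged values: with gradient accumulation of 16, each epoch yields 731 // 16 = 45 optimizer updates, and the Trainer rounds 400 / 45 up to 9 epochs when `max_steps` is set. A minimal sketch of that arithmetic (my reconstruction from the log above, not the actual `transformers` source):

```python
import math

# Values taken from the training log above
num_examples = 731
per_device_batch_size = 1
grad_accum_steps = 16
max_steps = 400

# Dataloader length: one batch per example at batch size 1
dataloader_len = math.ceil(num_examples / per_device_batch_size)  # 731

# Optimizer updates per epoch: accumulation folds 16 batches into one step
updates_per_epoch = max(dataloader_len // grad_accum_steps, 1)  # 45

# Epochs needed to reach max_steps, rounded up
num_epochs = math.ceil(max_steps / updates_per_epoch)

print(num_epochs)  # → 9
```

So to train for an exact number of epochs, the usual approach is to pass `--num_train_epochs` and drop `--max_steps` (the log itself warns that `max_steps` overrides `num_train_epochs`).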

Steps To Reproduce

.

Environment

- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :

Anything else?

No response