Oneflow-Inc / libai

LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training
https://libai.readthedocs.io
Apache License 2.0

fix max to keep #478

Closed CPFLAME closed 1 year ago

CPFLAME commented 1 year ago

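Fix the max_to_keep feature:

With the following settings in the config, the model is saved every 100 iterations and at most 3 checkpoints are kept. Only the most recent checkpoints are retained: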

train.checkpointer.period = 100
train.checkpointer.max_to_keep = 3
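
To illustrate the retention behavior these two settings describe, here is a minimal standalone sketch. It is not LiBai's implementation: the `RollingCheckpoints` class, the `simulate` helper, and the `model_{iteration:07d}` naming are all hypothetical, and the sketch assumes a checkpoint is written whenever `(iteration + 1) % period == 0`. It only tracks names in memory and touches no files.

```python
from collections import deque


class RollingCheckpoints:
    """Hypothetical helper (not LiBai's API) that tracks which checkpoints survive."""

    def __init__(self, period, max_to_keep):
        self.period = period            # save every `period` iterations
        self.max_to_keep = max_to_keep  # upper bound on retained checkpoints
        self.kept = deque()             # oldest checkpoint name on the left

    def step(self, iteration):
        """Record a save if one is due; return names that would be deleted."""
        removed = []
        # Assumption: iterations are 0-based and a save is due at the end of
        # every `period`-th iteration.
        if (iteration + 1) % self.period != 0:
            return removed
        self.kept.append(f"model_{iteration:07d}")  # hypothetical naming scheme
        # Drop the oldest entries until the limit is respected again.
        while len(self.kept) > self.max_to_keep:
            removed.append(self.kept.popleft())
        return removed


def simulate(total_iters, period=100, max_to_keep=3):
    """Run `total_iters` iterations and return the checkpoint names still kept."""
    ckpts = RollingCheckpoints(period, max_to_keep)
    for it in range(total_iters):
        ckpts.step(it)
    return list(ckpts.kept)
```

Under these assumptions, `simulate(500)` records five saves (at iterations 99, 199, 299, 399, and 499) and returns only the last three names, which mirrors the "keep only the newest 3" behavior described above.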