THUDM / SwissArmyTransformer

SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants.
https://THUDM.github.io/SwissArmyTransformer
Apache License 2.0

How should resuming training from a checkpoint be configured? #178

Open elesun2018 opened 4 months ago

elesun2018 commented 4 months ago

I can't find where the optimizer is saved. When resuming training from a checkpoint, how is the optimizer state carried over?

elesun2018 commented 4 months ago

On inheriting the learning rate: I only see client_lr_scheduler being saved; I can't find where client_lr_scheduler is loaded or used.

elesun2018 commented 4 months ago

So the optimizer does not inherit the state from the earlier training run, and the lr_scheduler inherits the earlier learning rate through args.iteration?

elesun2018 commented 4 months ago

Could you give some pointers? Thanks.

1049451037 commented 4 months ago

Sorry, sat does not yet support restoring the optimizer state from a checkpoint.
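
A possible workaround is to persist the optimizer state yourself with plain PyTorch alongside sat's model checkpoint. This is only a sketch, not part of sat's API; save_optimizer_state and load_optimizer_state are hypothetical helper names:

```python
import torch

# Hypothetical helpers, not part of sat's API: persist the optimizer state
# next to the model checkpoint and restore it before resuming training.
def save_optimizer_state(optimizer, path):
    # state_dict() holds the per-parameter buffers (e.g. Adam moments),
    # which is exactly the state that sat's checkpoints currently omit.
    torch.save(optimizer.state_dict(), path)

def load_optimizer_state(optimizer, path):
    # map_location="cpu" keeps the load from allocating GPU memory.
    optimizer.load_state_dict(torch.load(path, map_location="cpu"))
```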


1049451037 commented 4 months ago

Because the optimizer state usually takes up a lot of disk space, we do not save the optimizer. If you want the learning rate to be inherited through the saved iteration, change --mode finetune to --mode pretrain.
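
In other words, relaunch training with --mode pretrain instead of --mode finetune, and the learning rate is derived from the saved iteration rather than starting fresh. With a plain PyTorch scheduler the same idea looks roughly like the sketch below; this is not sat's actual resume code, resume_iteration stands in for args.iteration, and the decay schedule is made up for illustration:

```python
import torch

# Sketch of iteration-based learning-rate inheritance: rebuild the
# scheduler, then fast-forward it to the step saved in the checkpoint.
model = torch.nn.Linear(10, 10)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda step: max(0.0, 1.0 - step / 10000)
)

resume_iteration = 5000  # stand-in for args.iteration read from the checkpoint
for _ in range(resume_iteration):
    scheduler.step()  # replay steps so the LR matches where training stopped

print(optimizer.param_groups[0]["lr"])  # 5e-5, half of the initial 1e-4
```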