ssbuild / chatglm_finetuning

chatglm 6b finetuning and alpaca finetuning

Default LoRA training consumes 60 GB of GPU memory #253

Open · opened 1 year ago

commented 1 year ago

Latest dev branch, commit 6a42db4c1fdffee9ccc8f7d91775c5b4112738f6.

Using the default configuration: LoRA, with quantization disabled and DeepSpeed disabled.

# Module configuration; LoRA is enabled by default
enable_deepspeed = False
enable_ptv2 = False
enable_lora = True
enable_int8 = False # qlora int8
enable_int4 = False # qlora int4
INFO: 
  | Name                                  | Type      | Params
--------------------------------------------------------------------
0 | _TransformerLightningModule__backbone | LoraModel | 6.2 B 
--------------------------------------------------------------------
3.7 M     Trainable params
6.2 B     Non-trainable params
6.2 B     Total params
24,704.811 Total estimated model params size (MB)
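
For context, the 24,704 MB figure above is consistent with the ~6.2 B total parameters counted at 4 bytes each (fp32); loaded in fp16 the weights alone would be roughly half that. A rough back-of-the-envelope check (the parameter count is rounded here, so the numbers are approximate):

# Back-of-the-envelope check of the Lightning summary (illustration only;
# the exact parameter count is rounded, so results are approximate).
total_params = 6.18e9                      # "6.2 B" total params from the summary
bytes_fp32, bytes_fp16 = 4, 2

est_fp32_mb = total_params * bytes_fp32 / 1e6   # ~24,700 MB, roughly the logged estimate
est_fp16_gb = total_params * bytes_fp16 / 1e9   # ~12.4 GB for fp16 weights alone
print(f"fp32 estimate: {est_fp32_mb:,.0f} MB, fp16 weights: {est_fp16_gb:.1f} GB")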

Running train.py directly consumes 50-60 GB of GPU memory and OOMs on a single V100S. Is this expected?
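
One way to narrow down where the memory goes is to log allocated and peak GPU memory around a single training step. This is only a sketch: `model`, `batch`, and `optimizer` below are placeholders, not names from train.py.

# Hypothetical memory probe around one training step (placeholder names).
import torch

def log_mem(tag: str) -> None:
    # Report currently allocated and peak GPU memory in GB.
    alloc = torch.cuda.memory_allocated() / 1e9
    peak = torch.cuda.max_memory_allocated() / 1e9
    print(f"{tag}: allocated={alloc:.1f} GB, peak={peak:.1f} GB")

torch.cuda.reset_peak_memory_stats()
log_mem("after model load")         # frozen base weights + LoRA adapters
loss = model(**batch).loss          # forward pass: adds activations
log_mem("after forward")
loss.backward()                     # backward pass: gradients for the 3.7 M LoRA params
log_mem("after backward")
optimizer.step()                    # optimizer states for the LoRA params only
log_mem("after optimizer step")

If the peak jumps mainly after the forward pass, the consumption is dominated by activations (sequence length and batch size), rather than by the weights themselves.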