INFO:
| Name | Type | Params
--------------------------------------------------------------------
0 | _TransformerLightningModule__backbone | LoraModel | 6.2 B
--------------------------------------------------------------------
3.7 M Trainable params
6.2 B Non-trainable params
6.2 B Total params
24,704.811 Total estimated model params size (MB)
On the latest dev branch (6a42db4c1fdffee9ccc8f7d91775c5b4112738f6), using the default config with LoRA enabled, no quantization, and no DeepSpeed, running train.py directly consumes 50-60 GB of GPU memory and OOMs on a single V100S. Is this expected?
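For context, the base weights alone account for a large share of that footprint. A minimal back-of-envelope sketch (the byte sizes and the Adam-state factor are illustrative assumptions; fp32 weights are assumed here, which matches the ~24.7 GB "estimated model params size" Lightning reports above; activations are not counted):

```python
# Rough memory estimate for LoRA fine-tuning a 6.2 B-param model.
# Assumptions: fp32 base weights, gradients only for the 3.7 M
# trainable LoRA params, Adam keeping ~3 fp32 states per trainable param.

total_params = 6.2e9
trainable_params = 3.7e6

weights_gb = total_params * 4 / 1e9             # frozen base weights, fp32
grads_gb = trainable_params * 4 / 1e9           # grads only for LoRA params
optimizer_gb = trainable_params * 4 * 3 / 1e9   # Adam states for LoRA params

print(f"weights:   {weights_gb:.1f} GB")
print(f"grads:     {grads_gb:.4f} GB")
print(f"optimizer: {optimizer_gb:.4f} GB")
```

By this estimate the frozen weights are ~24.8 GB in fp32, while the LoRA gradients and optimizer states are negligible, so the remaining 25-35 GB would have to come from activations, temporary buffers, and fragmentation; a 32 GB V100S cannot hold even the fp32 weights plus activations.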