jiahe7ay / MINI_LLM

This is a repository for individuals to experiment with and reproduce the pre-training process of LLMs.
348 stars 53 forks

Different GPUs show different memory usage during the SFT stage #26

Open lainxx opened 6 months ago

lainxx commented 6 months ago

Hi, the SFT-stage code sets max_length=512 and truncation=True, so why does GPU memory usage still differ across cards, and why does it sometimes hit out of memory?
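
One possible explanation, offered as an assumption and not a confirmed diagnosis of this repository's code: truncation with max_length only caps sequence length, it does not pad shorter sequences up to 512. If batches are padded dynamically to the longest sequence in each batch, each GPU can receive tensors of a different width, so activation memory differs per card and peaks when one card gets a batch of full-length samples. The sketch below simulates this in plain Python; the sample lengths and helper names are hypothetical and the padding strategy is assumed.

```python
# Minimal simulation of truncation vs. padding. All lengths and helper
# names are hypothetical; they are not taken from the MINI_LLM code.

def truncate(seq, max_length=512):
    """Mimic truncation=True: cap the length, never extend it."""
    return seq[:max_length]

def pad_to_longest(batch, pad_id=0):
    """Dynamic padding: pad to the longest sequence in this batch only."""
    width = max(len(s) for s in batch)
    return [s + [pad_id] * (width - len(s)) for s in batch]

def pad_to_max_length(batch, max_length=512, pad_id=0):
    """Fixed padding: every batch ends up max_length wide."""
    return [s + [pad_id] * (max_length - len(s)) for s in batch]

def shape(batch):
    return (len(batch), len(batch[0]))

# Two GPUs draw different samples in the same training step.
batch_gpu0 = [truncate([1] * n) for n in (40, 75, 120, 60)]
batch_gpu1 = [truncate([1] * n) for n in (300, 900, 512, 450)]

dynamic_shape_gpu0 = shape(pad_to_longest(batch_gpu0))
dynamic_shape_gpu1 = shape(pad_to_longest(batch_gpu1))
fixed_shape_gpu0 = shape(pad_to_max_length(batch_gpu0))
fixed_shape_gpu1 = shape(pad_to_max_length(batch_gpu1))

print("dynamic padding:", dynamic_shape_gpu0, dynamic_shape_gpu1)
print("fixed padding:  ", fixed_shape_gpu0, fixed_shape_gpu1)
```

If this is what is happening, padding every sample to the full length (for example padding="max_length" in a Hugging Face tokenizer call) makes per-card memory uniform at the cost of extra compute, which also makes it easier to tell whether the worst-case batch fits in memory at all.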