Closed: chehx closed this issue 3 months ago
I read report 1.1 and it does not state that only 80G of memory is required for training. Where did you see that?
https://github.com/hpcaitech/Open-Sora/issues/344#issuecomment-2102359347
Honestly, I saw this response. Did I misunderstand something?
This issue is stale because it has been open for 7 days with no activity.
I found the problem!
When I load the pre-trained weights via Hugging Face, it also loads this config file:
https://huggingface.co/hpcai-tech/OpenSora-STDiT-v2-stage3/blob/main/config.json
In that file,
"enable_flash_attn": false, "enable_layernorm_kernel": false,
so both flash attention and the layernorm kernel are disabled!
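For anyone hitting the same thing, here is a minimal sketch of how one might check for the two flags quoted above and flip them in a local copy of the config. It only works on a plain dict parsed from JSON; the helper names (`disabled_flags`, `with_flags_enabled`) and the inline `raw` string are hypothetical stand-ins, and how the patched values get passed back into the model loader is not shown because that depends on the training script. It also assumes, as this thread suggests, that setting these flags to true is what brings memory usage down.

```python
import json

# Flags named in the config.json quoted above. Treating them as the
# memory-relevant switches is this thread's conclusion, not verified here.
MEMORY_FLAGS = ("enable_flash_attn", "enable_layernorm_kernel")


def disabled_flags(config: dict) -> list:
    """Return the memory-related flags that are missing or set to false."""
    return [name for name in MEMORY_FLAGS if not config.get(name, False)]


def with_flags_enabled(config: dict) -> dict:
    """Return a copy of the config with the memory-related flags set to true."""
    patched = dict(config)  # shallow copy, the original dict is left untouched
    for name in MEMORY_FLAGS:
        patched[name] = True
    return patched


# Stand-in for the downloaded config.json; only the two quoted keys are shown.
raw = '{"enable_flash_attn": false, "enable_layernorm_kernel": false}'
config = json.loads(raw)

print("disabled:", disabled_flags(config))
patched = with_flags_enabled(config)
print("disabled after patch:", disabled_flags(patched))
```

Enabling the flags presumably also requires the corresponding kernels to be installed in the training environment, so treat this as a diagnostic aid rather than a complete fix.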
I am gonna close this issue since it appears to have been resolved by the question owner.
Many thanks for open-sourcing this great project.
I am currently hitting an out-of-memory error during training.
I use the default training config in stage3.py on two A100 80GB GPUs.
Report 1.1 says the default config is meant for 80GB of memory, yet the error is still raised.
Currently, when I use 480p with 48 frames, it takes around 73GB.