foundation-model-stack / fms-fsdp

🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and the SDPA implementation of Flash attention v2.
https://pytorch.org/docs/stable/fsdp.html
Apache License 2.0
116 stars, 18 forks

revert old low_cpu_mode implementation #18

Closed: lchu-ibm closed this 4 months ago

lchu-ibm commented 4 months ago

Related issues:

https://github.com/foundation-model-stack/fms-fsdp/issues/6 and https://github.com/foundation-model-stack/fms-fsdp/issues/15

Both issues could technically be solved in a different, potentially better way, but neither fix is trivial. As a balanced decision at this stage, we resolve both for now with a slightly less optimal solution.

The corresponding issues should remain open so they can be revisited later.