foundation-model-stack / fms-fsdp

🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and the SDPA implementation of Flash Attention v2.
https://pytorch.org/docs/stable/fsdp.html
Apache License 2.0

more flexible selective ac #57

Closed by lchu-ibm 3 months ago

lchu-ibm commented 3 months ago

To address https://github.com/foundation-model-stack/fms-fsdp/issues/56

lchu-ibm commented 3 months ago

cc @lessw2020 who implemented the original selective ac.

lchu-ibm commented 3 months ago

@nairbv I changed the way selectivity is handled.

Please have another round of review.

lchu-ibm commented 3 months ago

@nairbv ready for another round of review.