foundation-model-stack/fms-fsdp
🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash attention v2.
https://pytorch.org/docs/stable/fsdp.html
Apache License 2.0
114 stars, 18 forks
increase accumulated_cache_size_limit to 128 to make 70b compile-able (#51, Closed)
lchu-ibm closed 3 months ago
lchu-ibm commented 3 months ago:
Stack from ghstack (oldest at bottom):
-> #51
#50
#49
#48
#47
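The PR body does not show the change itself, so the following is only a minimal sketch of what the title describes, not the PR's actual diff. It assumes the setting in question is `torch._dynamo.config.accumulated_cache_size_limit` and that it must be raised before the model is passed to `torch.compile`. The helper name `raise_accumulated_cache_limit` and the stand-in config value are hypothetical.

```python
from types import SimpleNamespace


def raise_accumulated_cache_limit(dynamo_config, limit=128):
    """Raise accumulated_cache_size_limit on a Dynamo-style config object.

    The value is only ever increased, so a larger limit set elsewhere is kept.
    Returns the limit in effect afterwards.
    """
    current = getattr(dynamo_config, "accumulated_cache_size_limit", 0)
    if limit > current:
        dynamo_config.accumulated_cache_size_limit = limit
    return dynamo_config.accumulated_cache_size_limit


if __name__ == "__main__":
    try:
        # Assumption: PyTorch 2.x, where this module exposes the setting.
        import torch._dynamo as dynamo
        config = dynamo.config
    except ImportError:
        # Stand-in so the sketch still runs without PyTorch installed;
        # the starting value here is made up for illustration.
        config = SimpleNamespace(accumulated_cache_size_limit=64)

    # Call this before torch.compile(model) so the limit applies to compilation.
    print(raise_accumulated_cache_limit(config, limit=128))
```

Taking the config object as a parameter keeps the helper usable in tests without PyTorch; in a training script the one-line assignment `torch._dynamo.config.accumulated_cache_size_limit = 128` would have the same effect.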