increase accumulated_cache_size_limit to 128 to make 70b compile-able

foundation-model-stack / fms-fsdp

🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash attention v2.

https://pytorch.org/docs/stable/fsdp.html

Apache License 2.0

114 stars, 18 forks

increase accumulated_cache_size_limit to 128 to make 70b compile-able #51

Closed: lchu-ibm closed this 3 months ago

lchu-ibm commented 3 months ago

Stack from ghstack (oldest at bottom):

-> #51
#50
#49
#48
#47
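
The diff itself is not reproduced on this page, so the following is only a minimal sketch of the kind of change the title describes. It assumes the setting in question is `torch._dynamo.config.accumulated_cache_size_limit`, which bounds how many compiled entries Dynamo accumulates for a code object before it stops recompiling. The helper name `raise_accumulated_cache_limit` is hypothetical and not part of fms-fsdp.

```python
def raise_accumulated_cache_limit(limit=128):
    """Hypothetical helper: raise Dynamo's accumulated cache size limit.

    Returns the value now in effect, or None if torch (or a setting with
    this name) is not available in the current environment.
    """
    try:
        import torch._dynamo as dynamo
    except ImportError:
        return None  # torch is not installed; nothing to configure

    config = dynamo.config
    if not hasattr(config, "accumulated_cache_size_limit"):
        return None  # assumption: the setting may be named differently here

    # Must run before the model is wrapped with torch.compile.
    config.accumulated_cache_size_limit = limit
    return config.accumulated_cache_size_limit


if __name__ == "__main__":
    print(raise_accumulated_cache_limit(128))
```

Calling the helper once at startup, before any `torch.compile` call, should be enough, since the value is a process-wide Dynamo setting.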
