foundation-model-stack / fms-fsdp

🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and the SDPA implementation of Flash Attention v2.
https://pytorch.org/docs/stable/fsdp.html
Apache License 2.0

The default model variant is 7b but it is not supported. #93

Open · htang2012 opened this issue 3 weeks ago

htang2012 commented 3 weeks ago

The default model variant is "7b": https://github.com/foundation-model-stack/fms-fsdp/blob/65b0ea670fa375bb0f7f6a285e7229bb96ebdd0f/fms_fsdp/config/training.py#L8

but it is not in the supported whitelist: https://github.com/foundation-model-stack/fms-fsdp/blob/65b0ea670fa375bb0f7f6a285e7229bb96ebdd0f/fms_fsdp/utils/config_utils.py#L25
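For illustration, here is a minimal paraphrase of the mismatch. The `model_variant` field name and the `"7b"` default come from the linked `training.py`; the check function name and the variant set are stand-ins for whatever the linked `config_utils.py` actually contains:

```python
from dataclasses import dataclass

# Paraphrased from fms_fsdp/config/training.py: the dataclass default.
@dataclass
class train_config:
    model_variant: str = "7b"

# Stand-in for the lookup in fms_fsdp/utils/config_utils.py: only prefixed
# names such as "llama2_7b" are accepted, so the bare "7b" default falls through.
_SUPPORTED = {"llama2_7b", "llama2_13b", "llama2_70b"}  # illustrative subset

def get_model_config(model_variant: str):
    if model_variant not in _SUPPORTED:
        raise ValueError(f"model variant {model_variant} not supported.")
    return {"variant": model_variant}  # stand-in for the real config object

cfg = train_config()
get_model_config(cfg.model_variant)  # raises: model variant 7b not supported.
```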

nairbv commented 3 weeks ago

Rather than fixing this particular issue, we might just want to migrate this code to use get_model directly: https://github.com/foundation-model-stack/foundation-model-stack/blob/main/fms/models/__init__.py#L210 (we could also register any custom configs, like 8b and 8b_4k, there).
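A minimal sketch of that direction, assuming the `get_model` entry point linked above (the exact signature, and the registration helper mentioned in the comment below, live in `fms/models/__init__.py`):

```python
from fms.models import get_model

# Hedged sketch: get_model resolves a registered (architecture, variant)
# pair into a model, which would replace fms-fsdp's local whitelist
# in config_utils.py.
model = get_model("llama", "7b")

# Custom variants such as 8b or 8b_4k would then be registered with fms
# itself rather than hard-coded in fms-fsdp; see the registration helper
# in the same fms/models module.
```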

htang2012 commented 3 weeks ago

By the way, it works if you add `--model_variant=llama2_7b` on the command line.
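For context, that flag is passed to the training entry point, e.g. something like `torchrun --nproc_per_node=8 main_training.py --model_variant=llama2_7b` (script name as in the repo README; the torchrun flags here are illustrative). The flag overrides the `train_config` default so the variant lookup finds a key that is actually registered.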