This PR addresses issue #10 by adding support for an FSDP-compatible HF QLoRA baseline to our benchmarks.
Feature
This allows users to specify a `no_peft_model` field in the plugin config `bnb.yaml`. When set, the `plugin.augmentation` function is bypassed and `SFTTrainer` manages the PEFT preparation of the model instead.
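A minimal sketch of how the field might be set. Only `no_peft_model` comes from this PR; the surrounding keys and values are illustrative and will differ from the actual `bnb.yaml` schema:

```yaml
# Hypothetical bnb.yaml sketch; keys other than no_peft_model are illustrative.
peft:
  quantization:
    bitsandbytes:
      quant_type: nf4
      # Skip plugin.augmentation and let SFTTrainer handle PEFT preparation.
      no_peft_model: true
```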
NOTE:
While the open-source approach to FSDP-compatible QLoRA removes the extraneous dtype casting in `prepare_model_for_kbit_training`, it only does so when the model is sharded. On a single device, it continues to use `prepare_model_for_kbit_training`, so users will still experience a slowdown due to the extraneous casting.
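For reference, a minimal sketch of the FSDP-compatible HF QLoRA path this baseline exercises, assuming standard `transformers`/`peft`/`trl` APIs. The model name, LoRA hyperparameters, and toy dataset are illustrative, not the benchmark's actual settings:

```python
# Sketch of FSDP-compatible QLoRA where the trainer, not the plugin,
# prepares the PEFT model. All specifics here are illustrative.
import torch
from datasets import Dataset
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTTrainer

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    # Keeping the packed 4-bit weights in a bf16 storage dtype is what
    # lets FSDP shard the quantized model.
    bnb_4bit_quant_storage=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",  # illustrative model choice
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
)

# With no_peft_model set, the plugin leaves the model unwrapped; passing a
# LoraConfig here lets SFTTrainer perform the PEFT preparation instead.
trainer = SFTTrainer(
    model=model,
    peft_config=LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"]),
    train_dataset=Dataset.from_dict({"text": ["example instruction and response"]}),
)
trainer.train()
```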