foundation-model-stack / fms-acceleration

🚀 Collection of libraries used with fms-hf-tuning to accelerate fine-tuning and training of large models.
Apache License 2.0

Allow BNB Plugin to be Loaded Without PEFT Wrapping #10

Closed · fabianlim closed this 1 month ago

fabianlim commented 1 month ago

This issue concerns the warnings that, for QLoRA PEFT, `peft_config` should be passed directly to `SFTTrainer`.
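
For reference, a minimal sketch of the recommended pattern, assuming `trl`'s `SFTTrainer`; the model name is an arbitrary example and `train_dataset` is assumed to be defined elsewhere:

```python
# Sketch of the pattern the warnings recommend: hand peft_config to
# SFTTrainer and let it wrap the quantized model itself, rather than
# calling get_peft_model() beforehand.
from peft import LoraConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from trl import SFTTrainer

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",  # example model, not prescribed by this issue
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
    ),
)

peft_config = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])

trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,  # assumed to be defined elsewhere
    peft_config=peft_config,      # SFTTrainer performs the PEFT wrapping itself
)
```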

`configs/bnb.yaml`

Add a new flag `no_peft_model`:


```yaml
# PEFT-related acceleration
peft:

  # quantization-related acceleration
  # e.g., kernels for quantized base weights
  quantization:

    # For loading BitsAndBytes quantized layers
    # to serve as 4bit base-weights for LoRA PEFT-tuning.
    # NOTE: currently AutoGPTQ is not properly integrated into huggingface /
    # bitsandbytes, thus recommended quant_type to be either "nf4"
    # or "fp4".
    bitsandbytes:
      quant_type: nf4

      # If True, then get_peft_model and prepare_model_for_kbit_training
      # will not be called.
      no_peft_model: False
```
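
To illustrate what the flag controls, here is a hypothetical sketch of how a plugin could branch on it; the function name `augment_model` and its signature are assumptions for this example, not the plugin's actual API:

```python
# Illustrative sketch only: `augment_model` and its signature are assumed
# for this example and are not the plugin's actual API.
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

def augment_model(model, peft_config: LoraConfig, no_peft_model: bool):
    if no_peft_model:
        # Leave the model unwrapped and return peft_config so the caller
        # (e.g., SFTTrainer) can perform the PEFT wrapping itself.
        return model, peft_config
    # Legacy path: prepare and wrap the model here, and pass nothing
    # downstream for the trainer to wrap.
    model = prepare_model_for_kbit_training(model)
    model = get_peft_model(model, peft_config)
    return model, None
```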
fabianlim commented 1 month ago

@achew010 note that we put a comment in the main README:

> Huggingface BNB QLoRA numbers were taken with legacy approaches, but we are aware of https://github.com/foundation-model-stack/fms-acceleration/issues/10 and will update our benchmarks. The above includes numbers using fusedOps-and-kernels; the actual implementation is coming soon, see below.

Now that this issue is closed, that comment should be removed.