ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0

You are calling `save_pretrained` to a 4-bit converted model, but your `bitsandbytes` version doesn't support it. #3951

Open shripadk opened 4 months ago

shripadk commented 4 months ago

Describe the bug

I have enabled 4-bit quantization for fine-tuning mistralai/Mistral-7B-v0.1. It seems Ludwig 0.10.1 pins bitsandbytes < 0.41.0, but when I run the trainer I get the following warning:

You are calling `save_pretrained` to a 4-bit converted model, but your `bitsandbytes` version doesn't support it. 
If you want to save 4-bit models, make sure to have `bitsandbytes>=0.41.3` installed.
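The warning stems from a version gate: serializing 4-bit weights requires bitsandbytes >= 0.41.3, while Ludwig 0.10.1 pins an older release. A minimal sketch of that kind of check, using the `packaging` library (the helper function name is hypothetical, not Ludwig's or transformers' actual API):

```python
from packaging import version

# Hypothetical helper mirroring the version gate behind the warning:
# saving 4-bit converted weights needs bitsandbytes >= 0.41.3.
def supports_4bit_save(bnb_version: str) -> bool:
    return version.parse(bnb_version) >= version.parse("0.41.3")

print(supports_4bit_save("0.40.2"))  # older release -> False, warning fires
print(supports_4bit_save("0.41.3"))  # -> True, 4-bit save supported
```

So the warning is expected as long as the installed bitsandbytes predates 0.41.3, regardless of whether training itself succeeds.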

To Reproduce

Steps to reproduce the behavior:

  1. Install Ludwig:
pip install ludwig[full]
  2. Config file (model.yaml):
model_type: llm
base_model: mistralai/Mistral-7B-v0.1

quantization:
  bits: 4

adapter:
  type: lora

prompt:
  template: |
    ### Instruction:
    {instruction}

    ### Input:
    {input}

    ### Response:

input_features:
  - name: prompt
    type: text

output_features:
  - name: output
    type: text

generation:
  temperature: 0.1

trainer:
  type: finetune
  epochs: 3
  optimizer:
    type: paged_adam
  batch_size: 1
  eval_steps: 100
  learning_rate: 0.0002
  eval_batch_size: 2
  steps_per_checkpoint: 1000
  learning_rate_scheduler:
    decay: cosine
    warmup_fraction: 0.03
  gradient_accumulation_steps: 16
  enable_gradient_checkpointing: true

preprocessing:
  sample_ratio: 0.1
  3. Train the model:
ludwig train --config model.yaml --dataset "ludwig://alpaca"
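For reference, the prompt template in the config above renders each dataset row into an Alpaca-style prompt before it reaches the model. A quick sketch of that rendering (the example row is made up for illustration):

```python
# The template from model.yaml above; Ludwig substitutes dataset columns
# into the {instruction} and {input} placeholders.
TEMPLATE = """### Instruction:
{instruction}

### Input:
{input}

### Response:
"""

# A made-up example row, just to show the rendered shape.
row = {"instruction": "Summarize the text.",
       "input": "Ludwig is a low-code AI framework."}
prompt = TEMPLATE.format(**row)
print(prompt)
```

The model is then trained to continue the text after `### Response:` with the `output` column.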

Expected behavior

The warning about the bitsandbytes version not supporting `save_pretrained` for 4-bit quantized models should not appear.

Environment (please complete the following information):

@alexsherstinsky

yogeshhk commented 4 months ago

Here is the notebook showing the run: https://colab.research.google.com/drive/1kmZhQKBzpHBJRJvvp9PEdPEUMfMu6dh7?usp=sharing. The first run asked for a RESTART; after restarting and running all the cells, the output is in that notebook. Just FYI: the model's output is "","", but that is most likely an issue with the base model. [@shripadk @alexsherstinsky]

yogeshhk commented 4 months ago

With more epochs, Gemma fine-tuning seems to work fine: https://console.cloud.google.com/vertex-ai/colab/notebooks?project=document-ai-374204&activeNb=projects%2Fdocument-ai-374204%2Flocations%2Fus-central1%2Frepositories%2F87000216-df46-4358-8bb1-6bc933f4c82b [@shripadk @alexsherstinsky]

alexsherstinsky commented 11 hours ago

@shripadk Are you still having these issues? A new version of Ludwig will be released next week, so you may wish to try again. Please keep an eye on the release announcement in our Discord. Thank you!

shripadk commented 9 hours ago

@alexsherstinsky thanks for the heads-up. I'll definitely take a look and get back to you on this, and I'll keep an eye on the release. Thanks again 🎉