huggingface / accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
https://huggingface.co/docs/accelerate
Apache License 2.0

AttributeError: 'BitsAndBytesConfig' object has no attribute 'load_in_4bit' #1588

Closed vrunm closed 1 year ago

vrunm commented 1 year ago

System Info

- `Accelerate` version: 0.21.0.dev0
- Platform: Linux-5.15.109+-x86_64-with-glibc2.35
- Python version: 3.10.10
- Numpy version: 1.23.5
- PyTorch version (GPU?): 2.0.0 (True)
- PyTorch XPU available: False
- System RAM: 15.63 GB
- GPU type: Tesla P100-PCIE-16GB
- `Accelerate` default config:
    Not found

Reproduction

While fine-tuning GPT-NeoX-20B with QLoRA using accelerate and bitsandbytes, I ran into this issue:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_name = "EleutherAI/gpt-neox-20b"

# Load the tokenizer that matches the model checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=quant_config)

Loading the model with the last line above fails with the following traceback:

Unexpected exception formatting exception. Falling back to standard exception
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3508, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "/tmp/ipykernel_28/122249750.py", line 1, in <module>
    model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=quant_config)
  File "/opt/conda/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 467, in from_pretrained
  File "/opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2244, in from_pretrained
    load_in_4bit = quantization_config.load_in_4bit
AttributeError: 'BitsAndBytesConfig' object has no attribute 'load_in_4bit'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 2105, in showtraceback
    stb = self.InteractiveTB.structured_traceback(
  File "/opt/conda/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1396, in structured_traceback
    return FormattedTB.structured_traceback(
  File "/opt/conda/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1287, in structured_traceback
    return VerboseTB.structured_traceback(
  File "/opt/conda/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1140, in structured_traceback
    formatted_exception = self.format_exception_as_a_whole(etype, evalue, etb, number_of_lines_of_context,
  File "/opt/conda/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1055, in format_exception_as_a_whole
    frames.append(self.format_record(record))
  File "/opt/conda/lib/python3.10/site-packages/IPython/core/ultratb.py", line 955, in format_record
    frame_info.lines, Colors, self.has_colors, lvals
  File "/opt/conda/lib/python3.10/site-packages/IPython/core/ultratb.py", line 778, in lines
    return self._sd.lines
  File "/opt/conda/lib/python3.10/site-packages/stack_data/utils.py", line 144, in cached_property_wrapper
    value = obj.__dict__[self.func.__name__] = self.func(obj)
  File "/opt/conda/lib/python3.10/site-packages/stack_data/core.py", line 734, in lines
    pieces = self.included_pieces
  File "/opt/conda/lib/python3.10/site-packages/stack_data/utils.py", line 144, in cached_property_wrapper
    value = obj.__dict__[self.func.__name__] = self.func(obj)
  File "/opt/conda/lib/python3.10/site-packages/stack_data/core.py", line 681, in included_pieces
    pos = scope_pieces.index(self.executing_piece)
  File "/opt/conda/lib/python3.10/site-packages/stack_data/utils.py", line 144, in cached_property_wrapper
    value = obj.__dict__[self.func.__name__] = self.func(obj)
  File "/opt/conda/lib/python3.10/site-packages/stack_data/core.py", line 660, in executing_piece
    return only(
  File "/opt/conda/lib/python3.10/site-packages/executing/executing.py", line 190, in only
    raise NotOneValueFound('Expected one value, found 0')
executing.executing.NotOneValueFound: Expected one value, found 0

Expected behavior

The model weights should load correctly from the Hugging Face Hub with the 4-bit quantization config applied.

sgugger commented 1 year ago

It looks like you need to upgrade your version of Transformers; 4-bit support is only available in 4.30.0 and later.
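
To make the version requirement above concrete, here is a minimal sketch of a pre-flight check that could run before building a 4-bit `BitsAndBytesConfig`. The helper names (`parse_version`, `supports_4bit`, `installed_transformers_version`) are hypothetical and not part of Transformers; the 4.30.0 threshold is taken from the comment above. Pre-release suffixes such as `.dev0` are ignored, which is a simplification, so a development build may still lack the feature.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# Minimum Transformers release with 4-bit support, per the comment above.
MIN_4BIT_VERSION = (4, 30, 0)


def parse_version(version: str) -> Tuple[int, ...]:
    """Return the leading numeric components of a version string.

    Suffixes such as 'dev0' or 'rc1' are dropped (a simplification), and the
    result is padded with zeros to at least three components.
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def supports_4bit(version: str) -> bool:
    """Check a Transformers version string against the 4.30.0 threshold."""
    return parse_version(version) >= MIN_4BIT_VERSION


def installed_transformers_version() -> Optional[str]:
    """Return the installed Transformers version, or None if it is missing."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_transformers_version()
    if found is None:
        print("transformers is not installed")
    elif supports_4bit(found):
        print(f"transformers {found}: 4-bit loading should be available")
    else:
        print(f"transformers {found} is older than 4.30.0; upgrade it first")
```

If the check reports an older version, upgrading the package (for example with `pip install --upgrade transformers`) and restarting the notebook kernel so the new version is actually imported would be the next step to try.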
