Running MiniCPM-Llama3-V-2_5-int4 locally on my M3 chip fails with the error Using `bitsandbytes` 8-bit quantization requires Accelerate #215

Open myBigbug opened 1 month ago

myBigbug commented 1 month ago

Current Behavior

I have installed the required dependencies in a conda environment. Running the command PYTORCH_ENABLE_MPS_FALLBACK=1 python xxx.py on my local Mac produces the error below.

- OS: macOS Sonoma 14.3 (M3)
- Python: 3.10
- Transformers: 4.40.0
- PyTorch: 2.1.2
- CUDA: None (using MPS)

lib'If you don't plan on using image functionality from ``, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
Unused kwargs: ['_load_in_4bit', '_load_in_8bit', 'quant_method']. These kwargs are not used in <class 'transformers.utils.quantization_config.BitsAndBytesConfig'>.
Traceback (most recent call last):
  File "/Users/xxx/miniCPM/", line 7, in <module>
    model = AutoModel.from_pretrained('/Users/gaobo60/aiModel/MiniCPM-Llama3-V-2_5-int4', trust_remote_code=True)
  File "/Users/xxx/miniCPM/lib/python3.10/site-packages/transformers/models/auto/", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/Users/xxxxx/miniCPM/lib/python3.10/site-packages/transformers/", line 3165, in from_pretrained
  File "/Users/xxxx/miniCPM/lib/python3.10/site-packages/transformers/quantizers/", line 62, in validate_environment
    raise ImportError(
ImportError: Using `bitsandbytes` 8-bit quantization requires Accelerate: `pip install accelerate` and the latest version of bitsandbytes: `pip install -i bitsandbytes`

myBigbug commented 1 month ago

I ran the command pip install -i bitsandbytes, but the error still occurs.

iceflame89 commented 1 month ago

Because bitsandbytes (see issue) does not support MPS yet, the int4 model cannot run on a Mac for now.

myBigbug commented 1 month ago

annatongtong commented 1 month ago

Is Windows 10 supported at the moment? I installed the requirements and bitsandbytes, and I still get: raise ImportError( ImportError: Using bitsandbytes 8-bit quantization requires Accelerate: pip install accelerate and the latest version of bitsandbytes: pip install -i bitsandbytes

iceflame89 commented 4 weeks ago

bitsandbytes currently supports only CUDA devices.
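
The constraint stated above can be checked before loading anything. The following is a minimal sketch, not the maintainers' method: `choose_checkpoint` and `cuda_is_available` are hypothetical helper names, and falling back to the non-quantized `openbmb/MiniCPM-Llama3-V-2_5` checkpoint on machines without CUDA is an assumption, since that model needs considerably more memory than the int4 one.

```python
def cuda_is_available() -> bool:
    """Report whether PyTorch can see a CUDA device (False if torch is missing)."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def choose_checkpoint(cuda_available: bool) -> str:
    """Pick a checkpoint name (hypothetical helper, not part of any library).

    The int4 checkpoint depends on bitsandbytes, which per this thread
    only works on CUDA devices. Without CUDA (for example on MPS or a
    CPU-only machine) this sketch assumes the non-quantized checkpoint.
    """
    if cuda_available:
        return "openbmb/MiniCPM-Llama3-V-2_5-int4"
    return "openbmb/MiniCPM-Llama3-V-2_5"


if __name__ == "__main__":
    print(choose_checkpoint(cuda_is_available()))
```

The returned name could then be passed to `AutoModel.from_pretrained(..., trust_remote_code=True)` as in the traceback above.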

annatongtong commented 3 weeks ago


On 2024-06-17 20:37:17, "Hongji Zhu" @.***> wrote:

vizshrc commented 1 week ago


sunny6206 commented 13 hours ago

I'm trying to fine-tune "openbmb/MiniCPM-Llama3-V-2_5-int4" with a custom dataset, but I get this error:

[2024-07-16 11:22:47,459] [INFO] [] Setting ds_accelerator to cuda (auto detect)
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] async_io: please install the libaio-dev package with apt
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
[WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.1
[WARNING] using untested triton version (2.1.0), only 1.0.0 is known to be compatible
/mnt/c/Users/akhil/OneDrive/Desktop/WITBE_INTA/llm/linux-env/lib/python3.10/site-packages/transformers/ FutureWarning: evaluation_strategy is deprecated and will be removed in version 4.46 of 🤗 Transformers. Use eval_strategy instead
  warnings.warn(
[2024-07-16 11:23:00,796] [INFO] [] cdb=None
[2024-07-16 11:23:00,796] [INFO] [] Initializing TorchBackend in DeepSpeed with backend nccl
Unused kwargs: ['_load_in_4bit', '_load_in_8bit', 'quant_method']. These kwargs are not used in <class 'transformers.utils.quantization_config.BitsAndBytesConfig'>.
low_cpu_mem_usage was None, now set to True since model is quantized.
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████| 2/2 [00:35<00:00, 17.78s/it]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Currently using LoRA for fine-tuning the MiniCPM-V model.
Traceback (most recent call last):
  File "/mnt/c/Users/akhil/OneDrive/Desktop/WITBE_INTA/llm/MiniCPM-V-main/finetune/", line 328, in <module>
    train()
  File "/mnt/c/Users/akhil/OneDrive/Desktop/WITBE_INTA/llm/MiniCPM-V-main/finetune/", line 274, in train
    model.base_model.vpm.requires_grad_(True)
  File "/mnt/c/Users/akhil/OneDrive/Desktop/WITBE_INTA/llm/linux-env/lib/python3.10/site-packages/torch/nn/modules/", line 2440, in requires_grad_
    p.requires_grad_(requires_grad)
RuntimeError: only Tensors of floating point dtype can require gradients
[2024-07-16 11:24:03,027] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 129003) of binary: /mnt/c/Users/akhil/OneDrive/Desktop/WITBE_INTA/llm/linux-env/bin/python3
Traceback (most recent call last):
  File "/mnt/c/Users/akhil/OneDrive/Desktop/WITBE_INTA/llm/linux-env/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/mnt/c/Users/akhil/OneDrive/Desktop/WITBE_INTA/llm/linux-env/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/", line 346, in wrapper
    return f(*args, **kwargs)
  File "/mnt/c/Users/akhil/OneDrive/Desktop/WITBE_INTA/llm/linux-env/lib/python3.10/site-packages/torch/distributed/", line 806, in main
    run(args)
  File "/mnt/c/Users/akhil/OneDrive/Desktop/WITBE_INTA/llm/linux-env/lib/python3.10/site-packages/torch/distributed/", line 797, in run
    elastic_launch(
  File "/mnt/c/Users/akhil/OneDrive/Desktop/WITBE_INTA/llm/linux-env/lib/python3.10/site-packages/torch/distributed/launcher/", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/mnt/c/Users/akhil/OneDrive/Desktop/WITBE_INTA/llm/linux-env/lib/python3.10/site-packages/torch/distributed/launcher/", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: FAILED
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-07-16_11:24:02
  host      : AlticeLab.
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 129003)
  error_file:
  traceback : To enable traceback see:

The code is:

#!/bin/bash
GPUS_PER_NODE=1
NNODES=1
NODE_RANK=0
MASTER_ADDR=localhost
MASTER_PORT=6001

MODEL="openbmb/MiniCPM-Llama3-V-2_5-int4" # or openbmb/MiniCPM-V-2
# ATTENTION: specify the path to your training data, which should be a json file consisting of a list of conversations.
# See the section for finetuning in README for more information.
DATA="/mnt/c/Users/akhil/OneDrive/Desktop/WITBE_INTA/llm/MiniCPM-V-main/finetune/vl_finetune_data.json"
EVAL_DATA="/mnt/c/Users/akhil/OneDrive/Desktop/WITBE_INTA/llm/MiniCPM-V-main/finetune/finetune-evaluate_data.json"
LLM_TYPE="minicpm" # if use openbmb/MiniCPM-V-2, please set LLM_TYPE=minicpm

DISTRIBUTED_ARGS="
    --nproc_per_node $GPUS_PER_NODE \
    --nnodes $NNODES \
    --node_rank $NODE_RANK \
    --master_addr $MASTER_ADDR \
    --master_port $MASTER_PORT
"
/mnt/c/Users/akhil/OneDrive/Desktop/WITBE_INTA/llm/linux-env/bin/torchrun $DISTRIBUTED_ARGS \
    --model_name_or_path $MODEL \
    --llm_type $LLM_TYPE \
    --data_path $DATA \
    --eval_data_path $EVAL_DATA \
    --remove_unused_columns false \
    --label_names "labels" \
    --prediction_loss_only false \
    --bf16 false \
    --bf16_full_eval false \
    --fp16 true \
    --fp16_full_eval true \
    --do_train \
    --do_eval \
    --tune_vision true \
    --tune_llm false \
    --use_lora true \
    --lora_target_modules "llm\..*layers\.\d+\.self_attn\.(q_proj|k_proj)" \
    --model_max_length 2048 \
    --max_slice_nums 9 \
    --max_steps 10000 \
    --eval_steps 1000 \
    --output_dir output/output_minicpmv2_lora \
    --logging_dir output/output_minicpmv2_lora \
    --logging_strategy "steps" \
    --per_device_train_batch_size 2 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 1 \
    --evaluation_strategy "steps" \
    --save_strategy "steps" \
    --save_steps 1000 \
    --save_total_limit 10 \
    --learning_rate 1e-6 \
    --weight_decay 0.1 \
    --adam_beta2 0.95 \
    --warmup_ratio 0.01 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --gradient_checkpointing true \
    --deepspeed ds_config_zero2.json \
    --report_to "tensorboard" # wandb
sunny6206 commented 13 hours ago

Also, I have set the correct path for the JSON dataset.
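
One plausible reading of the RuntimeError above, not confirmed anywhere in this thread, is that the int4 checkpoint stores its quantized weights in an integer dtype, so calling `requires_grad_(True)` on the whole vision module (as `--tune_vision true` does) reaches parameters that cannot require gradients. The sketch below shows a guard that enables gradients only where the dtype allows it. `unfreeze_float_params` is a hypothetical helper, and whether training only the remaining floating-point parameters is useful for this model is an open assumption.

```python
from typing import Any, Iterable


def unfreeze_float_params(params: Iterable[Any]) -> int:
    """Enable gradients on floating-point parameters only.

    `params` is expected to yield objects with the torch.Tensor methods
    `is_floating_point()` and `requires_grad_()`, such as the items from
    `module.parameters()`. Integer-typed (for example packed quantized)
    parameters are skipped instead of raising RuntimeError.
    Returns the number of parameters that were unfrozen.
    """
    enabled = 0
    for p in params:
        if p.is_floating_point():
            p.requires_grad_(True)
            enabled += 1
    return enabled
```

Under these assumptions, a call such as `unfreeze_float_params(model.base_model.vpm.parameters())` would stand in for the failing `model.base_model.vpm.requires_grad_(True)` line. A return value of 0 would indicate that nothing in the module is trainable in its quantized form.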