GraphPKU / PiSSA

PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models (NeurIPS 2024 Spotlight)
https://arxiv.org/abs/2404.02948

Ask for help: I encountered the following error while running `gsm8k_inference.py`: #12

Closed: cyp-jlu-ai closed this issue 5 months ago

cyp-jlu-ai commented 5 months ago

Traceback (most recent call last):
  File "/home/changyupeng/PiSSA/inference/gsm8k_inference.py", line 136, in <module>
    gsm8k_test(model=args.model, data_path=args.data_file, start=args.start, end=args.end, batch_size=args.batch_size, tensor_parallel_size=args.tensor_parallel_size)
  File "/home/changyupeng/PiSSA/inference/gsm8k_inference.py", line 93, in gsm8k_test
    llm = LLM(model=model, tensor_parallel_size=tensor_parallel_size)
  File "/home/changyupeng/miniconda3/envs/pisa/lib/python3.10/site-packages/vllm/entrypoints/llm.py", line 109, in __init__
    self.llm_engine = LLMEngine.from_engine_args(engine_args)
  File "/home/changyupeng/miniconda3/envs/pisa/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 386, in from_engine_args
    engine_configs = engine_args.create_engine_configs()
  File "/home/changyupeng/miniconda3/envs/pisa/lib/python3.10/site-packages/vllm/engine/arg_utils.py", line 287, in create_engine_configs
    model_config = ModelConfig(
  File "/home/changyupeng/miniconda3/envs/pisa/lib/python3.10/site-packages/vllm/config.py", line 118, in __init__
    self._verify_quantization()
  File "/home/changyupeng/miniconda3/envs/pisa/lib/python3.10/site-packages/vllm/config.py", line 184, in _verify_quantization
    raise ValueError(
ValueError: Unknown quantization method: bitsandbytes. Must be one of ['awq', 'gptq', 'squeezellm', 'marlin'].

Expected behavior: the program runs successfully and outputs the prediction results.

fxmeng commented 5 months ago

vLLM currently does not support bitsandbytes quantization. You can use `convert_nf4_model_to_bf16.py` to convert the residual model to 16-bit, and then use `merge_adapter_to_base_model.py` to merge it with the PiSSA module.
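
For reference, the merge step corresponds roughly to the following PEFT-based sketch; all paths are placeholders, and the repository's `merge_adapter_to_base_model.py` may differ in its exact arguments:

```python
# Minimal sketch of folding a PiSSA adapter into a 16-bit base model with PEFT.
# All paths are placeholders; the repository's merge_adapter_to_base_model.py
# may differ in its exact arguments and defaults.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_path = "path/to/residual_model_bf16"  # output of convert_nf4_model_to_bf16.py (assumed)
adapter_path = "path/to/pissa_adapter"           # trained PiSSA adapter directory (assumed)
output_path = "path/to/merged_model"

# Load the 16-bit residual (base) model.
base_model = AutoModelForCausalLM.from_pretrained(base_model_path, torch_dtype=torch.bfloat16)

# Attach the PiSSA adapter and fold its weights back into the base model.
model = PeftModel.from_pretrained(base_model, adapter_path)
model = model.merge_and_unload()

# Save the merged 16-bit model so vLLM can load it without any quantization flag.
model.save_pretrained(output_path)
AutoTokenizer.from_pretrained(base_model_path).save_pretrained(output_path)
```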

cyp-jlu-ai commented 5 months ago

> vLLM currently does not support bitsandbytes quantization. You can use `convert_nf4_model_to_bf16.py` to convert the residual model to 16-bit, and then use `merge_adapter_to_base_model.py` to merge it with the PiSSA module.

@fxmeng Thank you for your response.

I understand that vLLM does not currently support bitsandbytes. I would like to confirm the specific steps as follows:

1. Model conversion: run `convert_nf4_model_to_bf16.py` to convert the NF4 residual model to 16-bit.
2. Adapter merging: run `merge_adapter_to_base_model.py` to merge the PiSSA adapter into the converted base model.
3. Inference: run `gsm8k_inference.py` on the merged 16-bit model with vLLM, as in the sketch below.

Additionally, if there are any other matters or prerequisites to be aware of, please let me know. Thank you very much for your help!
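
My understanding of the final inference step is roughly the following; the merged model path and the sampling settings are placeholders, not the repository's exact configuration:

```python
# Minimal sketch of running the merged 16-bit model with vLLM.
# The model path and sampling parameters are placeholders, not the
# repository's exact settings; no quantization flag is needed anymore.
from vllm import LLM, SamplingParams

llm = LLM(model="path/to/merged_model", tensor_parallel_size=1)
sampling_params = SamplingParams(temperature=0.0, max_tokens=512)

# A single GSM8K-style prompt, just to illustrate the call.
prompt = "Question: Natalia sold 48 clips in April and half as many in May. How many clips did she sell in total? Answer:"
outputs = llm.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)
```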