Closed: yananchen1989 closed this issue 2 weeks ago
Hi @yananchen1989
The KeyError you're encountering suggests a mismatch between the model layers expected by the script and the layers actually loaded from the checkpoint. This can happen for several reasons, such as differences in model architecture or issues with the quantization process.
Here are a few steps you can take to troubleshoot and resolve this issue:
1. Verify Model Architecture: ensure that the model architecture expected by the script matches the architecture of the Phi-3.5-mini-instruct checkpoint. Check the model's documentation or source code for any discrepancies.
2. Check Quantization Settings: since you're using bitsandbytes quantization, make sure the quantization settings are correctly applied and compatible with the model. You might need to adjust the quantization parameters or try a different quantization method.
3. Update Dependencies: ensure that all your dependencies, including PyTorch, transformers, bitsandbytes, and any other libraries, are up to date. Compatibility issues can arise from outdated packages.
4. Load Model Weights Manually: if the automatic loading process is causing issues, you can inspect the checkpoint yourself and confirm which layer names and weights it actually contains (a quick way to do this is sketched after this list).
5. Consult GitHub Issues: check whether others have encountered similar issues and whether any solutions or workarounds are suggested in the GitHub repository or related forums: https://github.com/vllm-project/vllm/issues
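For points 1 and 4, a lightweight way to see which parameter names the checkpoint actually contains is to read its safetensors index instead of downloading the weights. A minimal sketch, assuming the repo ships a sharded-safetensors index file (model.safetensors.index.json); if it does not, the same names can be listed from the weight file itself with safetensors:

```python
import json
from huggingface_hub import hf_hub_download

# Fetch only the safetensors index (a small JSON file, no weight shards)
# and list the parameter names stored in the checkpoint.
index_path = hf_hub_download(
    repo_id="microsoft/Phi-3.5-mini-instruct",
    filename="model.safetensors.index.json",
)
with open(index_path) as f:
    weight_map = json.load(f)["weight_map"]

# Checkpoint keys usually carry a "model." prefix, while the traceback
# shows the name without it, so check both spellings.
for key in ("layers.21.mlp.gate_up_proj.weight",
            "model.layers.21.mlp.gate_up_proj.weight"):
    print(key, "->", "present" if key in weight_map else "missing")

# Dump every layer-21 MLP entry to compare naming conventions.
for name in sorted(weight_map):
    if ".layers.21.mlp." in name:
        print(name)
```

If the gate_up_proj weight is present on disk under its usual name, the KeyError likely points at the loading/quantization path rather than the checkpoint itself.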
thanks. I guess it is the quantization="bitsandbytes", load_format="bitsandbytes" settings that cause the error.
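One quick way to confirm that guess is to load the same model through vLLM's default (non-quantized) path; if that works while the bitsandbytes configuration fails, the quantized load path is the culprit. A minimal sketch (model id, context length, and prompt are illustrative):

```python
from vllm import LLM, SamplingParams

# Baseline: same model, default (non-quantized) load path.
llm = LLM(model="microsoft/Phi-3.5-mini-instruct", max_model_len=4096)

out = llm.generate(["Hello"], SamplingParams(temperature=0.0, max_tokens=16))
print(out[0].outputs[0].text)
```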
hi,
vllm 0.6.3.post1
here is the testing script:
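A minimal sketch of such a test, assuming vLLM's offline LLM entrypoint with the quantization="bitsandbytes", load_format="bitsandbytes" settings mentioned above; the model id, context length, and prompt are illustrative rather than the exact original script:

```python
from vllm import LLM, SamplingParams

llm_name = "microsoft/Phi-3.5-mini-instruct"  # Phi-3-mini-128k-instruct fails the same way

# bitsandbytes in-flight quantization settings from the report;
# this is the configuration that triggers the KeyError during weight loading.
llm = LLM(
    model=llm_name,
    quantization="bitsandbytes",
    load_format="bitsandbytes",
    max_model_len=4096,  # illustrative, keeps the KV cache small
)

sampling = SamplingParams(temperature=0.0, max_tokens=32)
outputs = llm.generate(["Write a short greeting."], sampling)
print(outputs[0].outputs[0].text)
```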
when llm_name is microsoft/Phi-3.5-mini-instruct or microsoft/Phi-3-mini-128k-instruct, or other models in the same series, inference fails with:
[rank0]: KeyError: 'layers.21.mlp.gate_up_proj.weight'