rezacute opened this issue 3 months ago (status: Open)
Have you tried lowering the max model length?
Yes, I have set max_model_len=4096 but I still get the error. However, if I run it through the vllm CLI and set the max model length there, it works:
vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct --max_model_len 4096
@rezacute It seems LangChain doesn't pass all engine args through automatically. Try passing them in the vllm_kwargs parameter as a dictionary (see the sketch below): https://github.com/langchain-ai/langchain/blob/16bd0697dce35f7a1672231959307c1efb05876b/libs/community/langchain_community/llms/vllm.py#L72-L73
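A minimal sketch of that suggestion, assuming the langchain_community VLLM wrapper and that max_model_len is the only extra engine argument needed:

```python
from langchain_community.llms import VLLM

# Extra vLLM engine arguments (such as max_model_len) are not top-level
# fields on the wrapper, so they are forwarded via vllm_kwargs.
llm = VLLM(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    max_new_tokens=256,
    vllm_kwargs={"max_model_len": 4096},
)

print(llm.invoke("What is the capital of France?"))
```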
Your current environment
🐛 Describe the bug
Unable to run inference on Llama-3.1 using the langchain_community vllm wrapper.
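The failing call is presumably something like the following sketch (a hypothetical reproduction; the original code is not shown in this report), where max_model_len is never forwarded to the vLLM engine:

```python
from langchain_community.llms import VLLM

# Hypothetical reproduction: without forwarding max_model_len via
# vllm_kwargs, the engine uses the model's full context length,
# which may fail at engine initialization on this setup.
llm = VLLM(model="meta-llama/Meta-Llama-3.1-8B-Instruct")
```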
error: