zky001 opened this issue 8 months ago
Hi, thanks for running our code. It looks like you are encountering an issue with vLLM. You could refer to https://github.com/vllm-project/vllm/issues/2418 and try the solution mentioned there. Since vLLM's behavior may depend on your CUDA and torch versions, I cannot determine the exact fix for your case. If you still encounter issues with vLLM, you may switch to Hugging Face inference instead; a minimal sketch of that fallback follows.
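For illustration only, a minimal Hugging Face inference sketch; the model path, dtype, and generation settings below are placeholders rather than values from this repo:

```python
# Minimal Hugging Face fallback, assuming a causal LM checkpoint.
# "your/model-path" is a placeholder -- substitute the checkpoint you are using.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "your/model-path"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Hello, world"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```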
> The model's max seq len (4096) is larger than the maximum number of tokens that can be stored in KV cache (1792). Try increasing `gpu_memory_utilization` or decreasing `max_model_len` when initializing the engine.
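For reference, a sketch of how those two options can be passed when constructing the vLLM engine; the model path and the exact values are placeholders you would adjust to your GPU memory:

```python
# Sketch of initializing vLLM with the options named in the error message.
# "your/model-path" is a placeholder; tune the values to your hardware.
from vllm import LLM, SamplingParams

llm = LLM(
    model="your/model-path",
    gpu_memory_utilization=0.95,  # give the KV cache more GPU memory (default is about 0.9)
    max_model_len=2048,           # or cap the context below the model's 4096 maximum
)

outputs = llm.generate(["Hello, world"], SamplingParams(max_tokens=128))
print(outputs[0].outputs[0].text)
```

Either raising `gpu_memory_utilization` or lowering `max_model_len` should let the KV cache hold the full sequence length the engine is asked to serve.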