Open wertyac opened 9 months ago
We ran Qwen-14b-chat-int4 on qwen.cpp and asked it the same question we asked the CUDA version. However, qwen.cpp returned the wrong answer, while the CUDA version answered correctly. So the LLM's answer quality appears to be degraded when running on qwen.cpp.