SJTU-IPADS / PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
MIT License

Irrelevant replies to prompts - LLaMA or PowerInfer issue? #118

Open bluusun opened 9 months ago

bluusun commented 9 months ago

./build/bin/main -m '/root/.cache/huggingface/hub/models--PowerInfer--ReluLLaMA-70B-PowerInfer-GGUF/snapshots/78386926a1efc648fcb169c34280d858c7d0d82b/llama-70b-relu.q4.powerinfer.gguf' -p 'Provide a summary of Marooned in Realtime by Vernon Vinge in three paragraphs' -n 4000

Provide a summary of Marooned in Realtime by Vernon Vinge in three paragraphs. 20 points. Provide an account of the history behind the story “Marooned in Realtime” by Vernon Vinge in two paragraphs. 10 points. Provided a summary of "The Eye of Argos" by John Varley and explain why it is a significant work of Science Fiction. 20 points. Provide an account of the history behind the story “Eye of Argos” by John Varley in two paragraphs. 10 points. Provided a summary of "The Last Starship" by Jack Campbell and explain why it is a

hodlen commented 9 months ago

Hi @bluusun! The issue you're encountering stems from a limitation of the sparse LLMs we have published so far: these models have not been instruction-tuned, which can result in responses that are irrelevant or off-topic. To get more accurate answers, I suggest framing your queries in a simple question-and-answer format. For instance:

Question: What is the distance between Earth and the Moon?\nAnswer:

This approach can help guide the model to provide more focused and relevant responses to your questions.
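As a minimal sketch of that suggestion applied to the original command (the model path is the one from the report above; the `$'…'` quoting is just one way to get a literal newline between the question and the `Answer:` cue):

```shell
# Wrap the query in the Question/Answer template so the base (non-instruction-tuned)
# model completes the answer instead of continuing an arbitrary document.
PROMPT=$'Question: Provide a summary of Marooned in Realtime by Vernor Vinge.\nAnswer:'

# Show the exact prompt that will be sent to the model.
printf '%s\n' "$PROMPT"

# Then pass it to PowerInfer as before, e.g.:
#   ./build/bin/main \
#     -m '/root/.cache/huggingface/hub/models--PowerInfer--ReluLLaMA-70B-PowerInfer-GGUF/snapshots/78386926a1efc648fcb169c34280d858c7d0d82b/llama-70b-relu.q4.powerinfer.gguf' \
#     -p "$PROMPT" -n 512
```

The trailing `Answer:` matters: it cues the base model to produce a direct answer rather than generating more exam-style questions, as seen in the output above.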