ggerganov / llama.cpp

LLM inference in C/C++
MIT License

POST /infill on server example returns empty content unless non-empty "prompt" is passed in #4027

Closed: jonastemplestein closed this issue 3 months ago

jonastemplestein commented 8 months ago

I'm running the server like this, using a quantised Code Llama:

./server -t 10 -ngl 0 -m models/gguf/codellama-7b.Q5_K_M.gguf -c 4096

The following curl request always returns instantly with an empty "content" key in the JSON response:

curl --request POST \
    --url http://localhost:8080/infill \
    --header "Content-Type: application/json" \
    --data '{"input_prefix":"def helloworld():\n    print(\"hell","input_suffix":"\n   print(\"goodbye world\")\n","temperature":0.1}'

If I add a non-empty "prompt" key to my request, the generation succeeds. But this contradicts the documentation, which says the prompt key is ignored:

curl --request POST \
    --url http://localhost:8080/infill \
    --header "Content-Type: application/json" \
    --data '{"prompt": "bla", "input_prefix":"def helloworld():\n    print(\"hell","input_suffix":"\n   print(\"goodbye world\")\n","temperature":0.1}'
jonastemplestein commented 8 months ago

I think I've fixed it in #4028, but I would appreciate it if somebody could double-check, as I don't really know what I'm doing 🤣

x4080 commented 8 months ago

@jonastemplestein Do you use it for codeshell-vscode? I tried your PR, but it doesn't seem to work for DeepSeek Coder: the server shut down by itself when trying a completion.

Sorry if this is not related.

jonastemplestein commented 8 months ago

I don't think that's related. I just fixed the bug where you had to pass in a prompt to the infill endpoint, even though the code then did nothing with it.
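
For context on why a prompt shouldn't be needed: /infill assembles its own prompt from input_prefix and input_suffix using the model's fill-in-the-middle tokens. A rough sketch of the idea, using Code Llama's FIM markers as literal strings (the real server works with the special token ids, so this is not the exact server code):

#include <iostream>
#include <string>

// Illustrative only: builds a Code Llama style FIM prompt
// "<PRE> {prefix} <SUF>{suffix} <MID>" from the request fields.
static std::string build_infill_prompt(const std::string & prefix, const std::string & suffix) {
    return "<PRE> " + prefix + " <SUF>" + suffix + " <MID>";
}

int main() {
    const std::string prefix = "def helloworld():\n    print(\"hell";
    const std::string suffix = "\n   print(\"goodbye world\")\n";
    std::cout << build_infill_prompt(prefix, suffix) << "\n";
    return 0;
}

So a user-supplied "prompt" is redundant here, which is why the docs say it is ignored.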

x4080 commented 8 months ago

@jonastemplestein ok then

SweetPotato95 commented 6 months ago

@x4080 @jonastemplestein Seems it is related. In request_completion() in server.cpp there's a check along the lines of

if (task.data.at("prompt").size() ...

which throws an error and causes a 500 when the key is missing, and the codeshell extension doesn't send a "prompt" parameter when using /infill.
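
For anyone who wants to see the failure mode in isolation: nlohmann::json (the JSON library server.cpp uses) throws json::out_of_range from at() when a key is absent, and an uncaught throw in the handler surfaces as a 500. A minimal standalone sketch (task_data is just an illustrative name, not the server's variable):

#include <iostream>
#include <nlohmann/json.hpp>

using json = nlohmann::json;

int main() {
    // Request body like the one the codeshell extension sends: no "prompt" key.
    json task_data = json::parse(R"({"input_prefix": "def f():", "input_suffix": "\n"})");

    // at() throws json::out_of_range on a missing key.
    try {
        std::cout << task_data.at("prompt") << "\n";
    } catch (const json::out_of_range & e) {
        std::cout << "at(\"prompt\") threw: " << e.what() << "\n";
    }

    // A defensive read falls back to a default instead of throwing.
    const std::string prompt = task_data.value("prompt", "");
    std::cout << "defensive prompt: '" << prompt << "'\n";
    return 0;
}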

ggerganov commented 6 months ago

Could you guys check if https://github.com/ggerganov/llama.cpp/pull/4833 fixes the issue?

SweetPotato95 commented 6 months ago

> Could you guys check if #4833 fixes the issue?

Yeap, it does.

github-actions[bot] commented 3 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.