ggerganov / llama.cpp

LLM inference in C/C++
MIT License
68.63k stars 9.86k forks

Server 'penalize_nl' parameter defaults to False? #7136

Closed AayushG159 closed 5 months ago

AayushG159 commented 6 months ago

I am unsure why this happens: the response ends with a long run of newlines (see the response below). I'm passing in the json.gbnf grammar to restrict the output. I have also set the following parameters -

Here is an example prompt I pass - <|user|>You are JARVIS, a capable and knowledgable AI assistant. You only respond in JSON.<|end|>\n<|user|>Name top 5 things to eat in Chicago. Also explain why.<|end|><|assistant|><s>

And here is the response I receive - {\n "top_5_food": [\n {\n "name": "Chicago-style Deep Dish Pizza",\n "reason": "This iconic dish is a must-try for its unique crust and generous toppings, originating from the city itself."\n },\n {\n "name": "Hot Dog on a Poppy Bun",\n "reason": "A Chicago classic featuring a poppy seed bun with all traditional toppings like mustard, relish, pickle, onions and sport peppers."\n },\n {\n "name": "Cotton Candy",\n "reason": "An indulgent treat that\'s perfect for kids or anyone looking for a sweet, fluffy cloud of sugar."\n },\n {\n "name": "Chicago-style Ribeye Sandwich",\n "reason": "A juicy steak sandwich with crispy fries and coleslaw on the side. It\'s known for its generous portions and rich flavors."\n },\n {\n "name": "Garrett Popcorn",\n "reason": "This unique popcorn is a Chicago specialty, made with caramelized sugar and butter, giving it a sweet and salty taste that\'s hard to resist."\n }\n ]\n}\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n

Is this because of how n_predict works internally, or is it model-related?

AayushG159 commented 6 months ago

Okay, I seem to have figured out the issue: penalize_nl was set to False. I never set it myself, so it takes the default value, but the docs say the default is True. To test this, I ran it again -

Server - .\llama.cpp\build\bin\Release\server --model E:\LLMs\Models\Phi-3-Mini\Phi-3-mini-4k-instruct-q4.gguf --ctx-size 4098 --n-gpu-layers 25 --threads 4 --port 6000 -fa --chat-template phi3

CURL request - curl --request POST --url http://localhost:6000/completion --header "Content-Type: application/json" --data '{"prompt": "<|user|>You are JARVIS, a capable and knowledgable AI assistant. You only respond in JSON.<|end|>\n<|user|>Name top 5 things to eat in Chicago. Also explain why.<|end|>","n_predict": 512, "stop": ["<|end|>"], "typical_p": 0.9, "temperature": 0.0, "seed": 123}'

Response - [image: server response showing penalize_nl set to false]

As you can see, penalize_nl takes the default value of False.
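
The effective value can also be read from the response programmatically instead of by eye. This is a minimal Python sketch, not taken from the thread. It assumes the /completion response is JSON with a `generation_settings` object that contains `penalize_nl`; the helper name and the sample body are hypothetical.

```python
import json


def effective_penalize_nl(response_body: str):
    """Return the penalize_nl value reported by the server, or None if absent.

    Assumption: the response is a JSON object with a "generation_settings"
    object that echoes the sampling parameters actually used.
    """
    data = json.loads(response_body)
    settings = data.get("generation_settings", {})
    return settings.get("penalize_nl")


# Hypothetical, heavily trimmed response body, for illustration only.
sample = '{"content": "{}", "generation_settings": {"penalize_nl": false}}'
print(effective_penalize_nl(sample))
```

Returning None when the field is missing keeps "not reported" distinct from an explicit false.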

Jeximo commented 6 months ago

Yes, readme is outdated. penalize_nl defaults to false.
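
Since the default is false, a request that wants the newline penalty has to set it explicitly. Below is a rough Python sketch of the same request as the curl command above, with `penalize_nl` added to the body. The helper names `build_payload` and `post_completion` are made up for illustration. Whether the flag alone stops the trailing newlines is not confirmed in this thread; it may also depend on the repeat-penalty settings.

```python
import json
import urllib.request


def build_payload(prompt: str, penalize_nl: bool = True) -> dict:
    """Build a /completion request body, setting penalize_nl explicitly."""
    return {
        "prompt": prompt,
        "n_predict": 512,
        "stop": ["<|end|>"],
        "typical_p": 0.9,
        "temperature": 0.0,
        "seed": 123,
        # Set explicitly rather than relying on the server default.
        "penalize_nl": penalize_nl,
    }


def post_completion(payload: dict, url: str = "http://localhost:6000/completion") -> dict:
    """POST the payload as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


payload = build_payload(
    "<|user|>Name top 5 things to eat in Chicago. Also explain why.<|end|>"
)
print(json.dumps(payload))
# To send it to a running server: result = post_completion(payload)
```

The network call is left commented out so the snippet runs without a server. The port and endpoint are the ones used in the curl command above.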

github-actions[bot] commented 5 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.