DevXT-LLC / ezlocalai

ezlocalai is an easy-to-set-up local artificial intelligence server with OpenAI-style endpoints.
MIT License

How to increase "completion_tokens" limit to go beyond 16? #8

Closed: dspdavinci closed this issue 8 months ago

dspdavinci commented 8 months ago

In examples.ipynb, I ran this cell (with my own DNS name substituted):

```python
import openai

openai.api_base = "https://localllm.dev01.datascience/v1"
openai.api_key = ""

prompt = "tell me a joke in more than 100 words"
messages = [{"role": "system", "content": prompt}]

response = openai.ChatCompletion.create(
    model="LLaMA2-13B-Estopia",  # "Mistral-7B-OpenOrca"
    messages=messages,
    temperature=1.31,
    max_tokens=16384,
    top_p=1.0,
    n=1,
    stream=False,
)
print(response)
```

{ "id": "cmpl-d3e9fe6f-0230-466f-b9f2-da59b30952ed", "object": "text_completion", "created": 1705507498, "model": "LLaMA2-13B-Estopia", "usage": { "prompt_tokens": 57, "completion_tokens": 16, "total_tokens": 73 }, "messages": [ { "role": "user", "content": "tell me a joke in more than 100 words" }, { "role": "assistant", "content": "Once upon a time, there was a little old lady who had lived alone for" } ] }

It seems `response['messages'][1]` is truncated, and `completion_tokens` is capped at 16. How can I increase the `completion_tokens` limit beyond 16 so the response content comes back complete? Thanks!
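For reference, the cap is visible directly in the `usage` block above, so it can be checked programmatically. A minimal sketch using the same legacy `openai` 0.x client as the cell above (the `usage` field names are taken from the response quoted here; everything else follows the original cell):

```python
# Minimal client-side check for silent truncation, using the legacy
# openai 0.x API as in the cell above. The usage field names come from
# the response shape quoted in this issue.
import openai

openai.api_base = "https://localllm.dev01.datascience/v1"  # same host as above
openai.api_key = ""

requested_max = 16384
response = openai.ChatCompletion.create(
    model="LLaMA2-13B-Estopia",
    messages=[{"role": "system", "content": "tell me a joke in more than 100 words"}],
    max_tokens=requested_max,
)

used = response["usage"]["completion_tokens"]
print(f"requested up to {requested_max} completion tokens, got {used}")
# A completion_tokens value pinned at 16 no matter what max_tokens you send
# indicates the server is not honoring the parameter.
```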

Josh-XT commented 8 months ago

Apologies, this was due to the max_tokens variable being passed to the wrong place. I just fixed it; you should be all set once you update.
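For anyone hitting the same symptom in their own OpenAI-style wrapper: the bug class described here is a `max_tokens` value that is parsed from the request but never forwarded to the backend call, which then falls back to its own small default (the OpenAI legacy completions API and llama-cpp-python's `create_completion` both default to 16, matching the truncation above). Below is a hedged sketch of the corrected wiring, not ezlocalai's actual code; the endpoint shape, `ChatCompletionRequest`, and the model path are illustrative assumptions:

```python
# Hypothetical sketch: make sure the client's max_tokens reaches the
# backend completion call. Names here (ChatCompletionRequest, llm, the
# model path) are illustrative, not ezlocalai's actual implementation.
from typing import List, Optional

from fastapi import FastAPI
from llama_cpp import Llama
from pydantic import BaseModel

app = FastAPI()
llm = Llama(model_path="models/LLaMA2-13B-Estopia.gguf")  # assumed path


class ChatCompletionRequest(BaseModel):
    model: str
    messages: List[dict]
    max_tokens: Optional[int] = None
    temperature: float = 1.0
    top_p: float = 1.0


@app.post("/v1/chat/completions")
def chat_completions(req: ChatCompletionRequest):
    # The bug class in this issue: max_tokens is parsed from the request
    # but never forwarded, so the backend's default silently applies.
    return llm.create_chat_completion(
        messages=req.messages,
        max_tokens=req.max_tokens,  # forward it explicitly
        temperature=req.temperature,
        top_p=req.top_p,
    )
```

With `max_tokens` forwarded like this, `usage.completion_tokens` tracks the requested limit (or the model's natural stopping point) instead of pinning at 16.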