Open nxitik opened 11 months ago
```python
result = pipeline(prompt, max_new_tokens=2048, stop="<bot_end>")
```

The pipeline is:

```python
pipeline = pipeline(
    "text-generation",
    model="Nexusflow/NexusRaven-V2-13B",
    torch_dtype="auto",
    device_map="auto",
)
```
Error:

```
ValueError: The following `model_kwargs` are not used by the model: ['stop'] (note: typos in the generate arguments will also show up in this list)
```
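Since the `text-generation` pipeline rejects `stop` as a model kwarg here, one workaround is to generate without it and truncate the decoded text at the stop string afterwards. This is only a sketch: `truncate_at_stop` and the sample `generated` string are hypothetical, not part of the transformers API.

```python
def truncate_at_stop(text, stop_strings):
    """Cut generated text at the earliest occurrence of any stop string."""
    cut = len(text)
    for s in stop_strings:
        idx = text.find(s)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

# Hypothetical model output that runs past the intended stop token:
generated = 'Call: get_weather(city="Paris")<bot_end>Thought: The user wants...'
print(truncate_at_stop(generated, ["<bot_end>", "Thought:"]))
# Call: get_weather(city="Paris")
```

This keeps the pipeline call unchanged and moves the stopping logic into post-processing, at the cost of still paying for the extra generated tokens.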
I am also struggling to get this to work. Even when I set stop criteria, it will always generate the prompt:

```python
llm = Ollama(
    model="nexusraven",
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
    base_url="http://127.0.0.1:11434",
    stop=["<bot_end>"],
)
```
Same as you, the "<bot_end>" token still shows up in my output.
Okay, so I found a stupidly easy solution. The function calls are always returned first, so just set the stopping criterion to "Thought:" and generation will stop immediately after the function calls:
```python
llm = Ollama(
    model="nexusraven",
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
    base_url="http://127.0.0.1:11434",
    stop=["Thought:"],
)
```
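The same trick works as plain post-processing if your client does not support a `stop` parameter: split the raw output at "Thought:" and keep only the call part. A minimal sketch, assuming the output follows the `Call: ... Thought: ...` layout described above (the `extract_call` helper and sample string are mine, not from the model card):

```python
def extract_call(raw_output):
    """Keep only the function-call part, dropping the Thought: explanation."""
    head, _sep, _tail = raw_output.partition("Thought:")
    return head.strip()

raw = "Call: multiply(a=3, b=7) \nThought: The user asked for a product."
print(extract_call(raw))
# Call: multiply(a=3, b=7)
```

Stopping server-side (as in the Ollama config above) is still preferable, since it saves the tokens; this helper only cleans up output that has already been generated.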
Hi, excellent work on function calling. However, how can I use this to save on inference time and tokens?