Open nxitik opened 11 months ago
```python
result = pipeline(prompt, max_new_tokens=2048, stop="<bot_end>")
```

The pipeline is:

```python
pipeline = pipeline(
    "text-generation",
    model="Nexusflow/NexusRaven-V2-13B",
    torch_dtype="auto",
    device_map="auto",
)
```
Error:

```
ValueError: The following `model_kwargs` are not used by the model: ['stop'] (note: typos in the generate arguments will also show up in this list)
```
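Since the `text-generation` pipeline rejects `stop` as a model kwarg here, one workaround is to generate without it and truncate the decoded text at the stop string afterwards. This is only a sketch: `truncate_at_stop` and the sample `generated` string are hypothetical, not part of the transformers API.

```python
def truncate_at_stop(text, stop_strings):
    """Cut generated text at the earliest occurrence of any stop string."""
    cut = len(text)
    for s in stop_strings:
        idx = text.find(s)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

# Hypothetical model output that runs past the intended stop token:
generated = 'Call: get_weather(city="Paris")<bot_end>Thought: The user wants...'
print(truncate_at_stop(generated, ["<bot_end>", "Thought:"]))
# Call: get_weather(city="Paris")
```

This keeps the pipeline call unchanged and moves the stopping logic into post-processing, at the cost of still paying for the extra generated tokens.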
I am also struggling to get this to work. Even when I set stop criteria, it will always generate the prompt:

```python
llm = Ollama(
    model="nexusraven",
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
    base_url="http://127.0.0.1:11434",
    stop=["<bot_end>"],
)
```
Same as you, the "<bot_end>" token still shows up in my output.
Okay, so I found a stupidly easy solution. The function calls are always returned first, so just set the stopping criterion to "Thought:" and generation will stop immediately after the function calls:
```python
llm = Ollama(
    model="nexusraven",
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
    base_url="http://127.0.0.1:11434",
    stop=["Thought:"],
)
```
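The same trick works as plain post-processing if your client does not support a `stop` parameter: split the raw output at "Thought:" and keep only the call part. A minimal sketch, assuming the output follows the `Call: ... Thought: ...` layout described above (the `extract_call` helper and sample string are mine, not from the model card):

```python
def extract_call(raw_output):
    """Keep only the function-call part, dropping the Thought: explanation."""
    head, _sep, _tail = raw_output.partition("Thought:")
    return head.strip()

raw = "Call: multiply(a=3, b=7) \nThought: The user asked for a product."
print(extract_call(raw))
# Call: multiply(a=3, b=7)
```

Stopping server-side (as in the Ollama config above) is still preferable, since it saves the tokens; this helper only cleans up output that has already been generated.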
Hi, excellent work on function calling. However, how can I use this to save on inference time and tokens?