TehNomad closed this issue 1 year ago
Hey @TehNomad,
Thanks for giving me a heads-up about this problem. Could you provide some additional info?
The more details, the better :) Thanks!
@zatevakhin
I know this was closed because no further details were provided, but I'm seeing the same thing today. I'm using Mixtral 8x7B, and it happened on every prompt I tested. I'm not using the llama-cpp server; instead I'm using oobabooga's text-generation-webui and its OpenAI-compatible API. Decoding starts and works fine while the response streams in and is built up, but it then fails when the last token is generated, for some reason.
If I disable "typewriter" mode so that it doesn't use the streaming API, it works completely fine.
I've captured the entire API call made when it fails:
{"prompt":"\n\n### Instructions: Let's write a story about a cat named Buzz \"Light-year\" Aldrin\nCome up with 5 story topics\n\n### Response:\n","stop":["###"],"max_tokens":256,"temperature":0.1,"repeat_penalty":1.1,"top_p":0.95,"top_k":40,"stream":true}
and the response from the server (as it was streamed, captured via Wireshark), attached as a file because it's gigantic:
data_stream.txt
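For context, an OpenAI-compatible streaming endpoint delivers the completion as Server-Sent Events: one `data: {...}` line per chunk, terminated by `data: [DONE]`. Below is a minimal sketch of consuming such a stream, splitting on event boundaries before calling JSON.parse; the endpoint, payload, and buffering logic are illustrative assumptions, not the plugin's actual code.

```typescript
// Sketch: consume an OpenAI-compatible streaming completion.
// Assumes the server emits SSE frames of the form "data: {json}\n\n",
// ending with "data: [DONE]". (Illustration only, not the plugin's code.)
async function streamCompletion(baseUrl: string, prompt: string): Promise<string> {
  const response = await fetch(`${baseUrl}/v1/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt, max_tokens: 256, temperature: 0.1, stream: true }),
  });

  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  let text = "";

  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // A single network read may contain several "data:" events, or a partial one,
    // so split on the blank-line delimiter and only parse complete events.
    const events = buffer.split("\n\n");
    buffer = events.pop() ?? ""; // keep any trailing partial event for the next read

    for (const event of events) {
      const payload = event.replace(/^data:\s*/, "").trim();
      if (!payload || payload === "[DONE]") continue;
      text += JSON.parse(payload).choices[0].text;
    }
  }
  return text;
}
```

Passing a raw network chunk that contains more than one event (for example, the final token plus the `[DONE]` marker) straight to `JSON.parse` is one way a stream can fail only at the very end, though I haven't confirmed that's what happens here.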
I was able to install the llama-cpp-python server with pip and the local LLM plugin via BRAT. I keep getting the following error when I try to use LLM Instruction:
Error: SyntaxError: Unexpected non-whitespace character after JSON at position 230
In the Obsidian console, this is the error message:
Error: SyntaxError: Unexpected non-whitespace character after JSON at position 230 at JSON.parse (<anonymous>) at U (plugin:obsidian-local-llm:31:554)
I tried a couple of different prompts and three different models and got the same error. In the terminal window where the llama_cpp server is running, text generation seems to finish, because the timings are posted.
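Since generation completes on the server side, the failure looks client-side, in how the streamed response is parsed. One way `JSON.parse` can throw exactly this SyntaxError is when it is handed a string containing more than one JSON document, e.g. two streamed events that arrived in the same network read: parsing stops after the first object and the next non-whitespace character triggers the error. A tiny illustration follows; the chunk contents are hypothetical, and this is only a guess at the failure mode, not a confirmed trace through the plugin code.

```typescript
// Hypothetical chunk: two streamed events delivered in a single network read.
const chunk =
  '{"choices":[{"text":"Hello"}]}\n\ndata: {"choices":[{"text":" world"}]}';

try {
  // Parsing the whole chunk at once fails with a SyntaxError of the form
  // "Unexpected non-whitespace character after JSON at position N":
  // the parser finishes the first object, then hits the second event.
  JSON.parse(chunk);
} catch (e) {
  console.error(e);
}
```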