Closed · netixc closed this issue 5 months ago
Here is the working plugin:
```python
import json

import requests

from pluginInterface import LLMPluginInterface


class LocalLLM(LLMPluginInterface):
    context_length = 4096
    base_url = "http://localhost:1234/v1"

    def predict(self, message, history, system_prompt):
        # Build the message list from the system prompt and chat history
        messages = [{"role": "system", "content": system_prompt}]
        for user, ai in history:
            messages.append({"role": "user", "content": user})
            messages.append({"role": "assistant", "content": ai})
        messages.append({"role": "user", "content": message})

        # Flatten the messages into a single prompt string for the
        # /completions endpoint (note that this drops the role structure)
        prompt = " ".join(msg["content"] for msg in messages)

        # Prepare the API request payload
        payload = {
            "prompt": prompt,
            "stream": True,
            "temperature": 0.95,
        }

        # Debugging output to trace HTTP request details
        print("Sending request to:", self.base_url + "/completions")
        print("Payload:", json.dumps(payload))

        # Send the POST request and keep the connection open for streaming
        response = requests.post(
            f"{self.base_url}/completions", json=payload, stream=True
        )

        # Bail out early on a non-200 status code
        if response.status_code != 200:
            yield "Error in generating response."
            return

        # Buffer that accumulates the full response progressively, so
        # each yield contains everything generated so far
        accumulated_text = ""

        # Process the streaming response line by line
        for line in response.iter_lines(decode_unicode=True):
            # Only lines that start with "data:" carry payloads
            if line.startswith("data:"):
                data_str = line[5:].strip()
                # The server signals the end of the stream with [DONE]
                if data_str == "[DONE]":
                    break
                # Decode each JSON chunk and accumulate its text
                try:
                    data = json.loads(data_str)
                    for choice in data.get("choices", []):
                        accumulated_text += choice["text"]
                    yield accumulated_text
                except json.JSONDecodeError:
                    yield "Error in generating response."
```
Also, if you have a GGUF model you can just copy the local LLM plugin and change the model path.
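For reference, running a GGUF directly in-process (rather than through LM Studio's server) could look roughly like this with llama-cpp-python; the model path is a placeholder and this is only a sketch, not the repo's actual plugin code:

```python
from llama_cpp import Llama

# Placeholder path to a local GGUF file; replace with your own
MODEL_PATH = "models/dolphin-2.8-mistral-7b-v02-Q8_0.gguf"

# Load the model in-process instead of calling a local server
llm = Llama(model_path=MODEL_PATH, n_ctx=4096)

# Stream a completion; each chunk mirrors the OpenAI "choices" shape
for chunk in llm("Hello, how are you?", max_tokens=64, stream=True):
    print(chunk["choices"][0]["text"], end="", flush=True)
print()
```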
First of all, thank you for the code. I tried it with LM Studio and this is the output I get. I used dolphin-2.8-mistral-7b-v02-Q8_0.gguf with LM Studio and ChatML as the template. Any idea how I can make it so the output is only the response?
And lastly, can you give me your PayPal so I can buy you a coffee?
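One workaround for template text leaking into the output, assuming the server honors the OpenAI-style `stop` parameter: pass the template's end-of-turn token as a stop sequence. With ChatML that token is `<|im_end|>`, so the payload in the plugin above could hypothetically become:

```python
# Hypothetical tweak to the payload built in predict() above
payload = {
    "prompt": "...",  # the flattened prompt from the plugin
    "stream": True,
    "temperature": 0.95,
    # ChatML ends each turn with <|im_end|>; passing it as a stop
    # sequence asks the server to halt generation there (assumes the
    # server supports the OpenAI-style "stop" parameter)
    "stop": ["<|im_end|>"],
}
```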
I could not reproduce this issue. Maybe try a fresh install of LM Studio. https://www.paypal.me/xiaoheiOvO Thanks for the support <3
I tried another model and another template and it worked. Thank you! Is there much to change to run it on a MacBook as well?
I haven't tested on a Mac, but you shouldn't need to change the code; you just need to set up the environment yourself.
Hello, I've been trying to make this work with localhost:1234/v1. I'm not really a coder, so I use GPT-4 to help me, but that's also not enough. GPT-4 made this for me, but I don't get the response back, even though the LM Studio log shows the response.