keldenl / gpt-llama.cpp

A llama.cpp drop-in replacement for OpenAI's GPT endpoints, allowing GPT-powered apps to run off local llama.cpp models instead of OpenAI.
MIT License
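For context, a "GPT-powered app" talks to this server with the same JSON body it would send to OpenAI's chat completions endpoint. A minimal sketch of such a request body is below; the endpoint URL, port, and model name are illustrative assumptions, not documented project defaults — check your own gpt-llama.cpp setup.

```javascript
// Illustrative OpenAI-style chat completion payload. Field names follow
// the OpenAI API shape that gpt-llama.cpp mimics; the model name and the
// URL in the comment below are assumptions, not project defaults.
const body = {
  model: 'gpt-3.5-turbo', // gpt-llama.cpp serves this from the local ggml model
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'How are you doing today?' },
  ],
  temperature: 0.7,
};

// A client would POST this as JSON to the local server, e.g.:
// fetch('http://localhost:443/v1/chat/completions', {
//   method: 'POST',
//   headers: { 'Content-Type': 'application/json' },
//   body: JSON.stringify(body),
// });
```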

SERVER BUSY, REQUEST QUEUED #51

Open CyberRide opened 1 year ago

CyberRide commented 1 year ago

```
===== CHAT COMPLETION REQUEST =====

===== LLAMA.CPP SPAWNED =====
/root/llama.cpp/main -m /root/llama.cpp/models/7B/ggml-model-q4_0.bin --temp 0.7 --n_predict 4000 --top_p 0.1 --top_k 40 -b 2000 -c 4096 --seed -1 --repeat_penalty 1.1764705882352942 --reverse-prompt user: --reverse-prompt user --reverse-prompt system: --reverse-prompt system --reverse-prompt ## --reverse-prompt --reverse-prompt ### -i -p ### Instructions
Complete the following chat conversation between the user and the assistant. System messages should be strictly followed as additional instructions.

### Inputs
system: You are a helpful assistant.
user: How are you?
assistant: Hi, how may I help you today?
system: You are ChatGPT, a helpful assistant developed by OpenAI.

### Response
user: How are you doing today?
assistant:

===== REQUEST =====
user: How are you doing today?

===== RESPONSE =====

REQUEST RECEIVED
SERVER BUSY, REQUEST QUEUED
```
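The "SERVER BUSY, REQUEST QUEUED" line suggests the server runs one llama.cpp completion at a time and queues any request that arrives while another is in flight. A minimal sketch of that one-at-a-time queueing pattern is below; the class and method names are illustrative, not gpt-llama.cpp's actual internals.

```javascript
// Sketch of single-concurrency request queueing, as the log suggests:
// only one completion runs at a time, later requests wait their turn.
// Names here are illustrative assumptions, not the project's real code.
class RequestQueue {
  constructor() {
    this.busy = false;
    this.pending = [];
  }

  // Submit an async job; resolves with the job's result once it has run.
  submit(job) {
    return new Promise((resolve) => {
      if (this.busy) {
        console.log('SERVER BUSY, REQUEST QUEUED');
        this.pending.push({ job, resolve });
      } else {
        this.run(job, resolve);
      }
    });
  }

  // Run one job, then drain the next queued request, if any.
  async run(job, resolve) {
    this.busy = true; // set synchronously, before the first await
    resolve(await job());
    this.busy = false;
    const next = this.pending.shift();
    if (next) this.run(next.job, next.resolve);
  }
}
```

With this shape, a second request submitted while the first is still generating is queued rather than rejected, which matches the behavior reported in this issue.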