kardolus / chatgpt-cli

ChatGPT CLI is a versatile tool for interacting with ChatGPT models via OpenAI and Azure, as well as with models from Perplexity AI. It offers streaming, query mode, and history tracking for seamless, context-aware interactions. With extensive configuration options, it’s designed for both users and developers to create a tailored GPT experience.

Nice work! Is there any tutorial on how to use a self-hosted LLM model with the same OpenAI API? #6

Closed lucasjinreal closed 1 year ago

lucasjinreal commented 1 year ago

Nice work! Is there any tutorial on how to use a self-hosted LLM model with the same OpenAI API?

kardolus commented 1 year ago

I haven't played with self-hosted LLM models. I would assume they come with their own API. In the future I want to support non-ChatGPT APIs as well; I think I will start with Azure.

Let me know if you find more info on self-hosted models.

lucasjinreal commented 1 year ago

@kardolus Hi, thank you for your interest in this. More and more people are hosting their own LLMs behind an OpenAI-compatible API server so that third-party clients (both CLI and frontend) can be reused.

I think FastChat provides such a server, but their implementation is a bit complicated, and you need a decent GPU to host it.

However, I can help test, since I already have a server that provides LLM inference behind the same OpenAI API. I can get streaming responses from it with exactly the same openai package:

import argparse

import openai  # this uses the legacy openai<1.0 ChatCompletion interface

# get_ip() is a small helper from my script that returns the inference server's IP
parser = argparse.ArgumentParser()
parser.add_argument("--ip", type=str, default=get_ip())
args = parser.parse_args()

local_ip = args.ip

openai.api_key = "EMPTY"  # the local server does not support API keys yet
openai.api_base = f"http://{local_ip}:80/v1"
# openai.api_base = f"https://{local_ip}:443/v1"
# openai.verify_ssl_certs = False

print(f"api base: {openai.api_base}")

model = "billa"
prompt = "Hello, please introduce yourself and tell me what snacks Beijing has."

stream_mode = True

if stream_mode:
    # create a streaming chat completion and print tokens as they arrive
    for chunk in openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    ):
        content = chunk["choices"][0].get("delta", {}).get("content")
        if content is not None:
            print(content, end="", flush=True)
else:
    # create a regular (non-streaming) chat completion
    completion = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    # print the full completion once it is ready
    print(completion.choices[0].message.content)

Can you explain how I could use your CLI to build an interactive client? (IMO, the minimal modification would just be changing the URL in your package.)

kardolus commented 1 year ago

You don't have to modify the code to change the URL or the completions and models paths. You can either edit/create the config.yaml in ~/.chatgpt-cli or use environment variables to set them (OPENAI_URL, OPENAI_COMPLETIONS_PATH, and OPENAI_MODELS_PATH).
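For example, something along these lines should point the CLI at your server, assuming it exposes the standard OpenAI-style paths over plain HTTP (replace <local_ip> with your actual address; the paths and the dummy key are just my assumptions based on your snippet):

export OPENAI_API_KEY="EMPTY"
export OPENAI_URL="http://<local_ip>:80"
export OPENAI_COMPLETIONS_PATH="/v1/chat/completions"
export OPENAI_MODELS_PATH="/v1/models"
chatgpt "hello, please introduce yourself"

The same values can also go into ~/.chatgpt-cli/config.yaml if you prefer a persistent setup.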

However, you may need to alter the code in order to handle the payloads returned by your local LLM model. In that case you would have to set up/change the current types (types package).
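As a rough sketch, if your server emits the standard OpenAI response shape, the existing types should largely map onto it; a non-streaming reply looks roughly like the first line below, and streaming replies arrive as "data:" lines of chunks with a delta, terminated by "data: [DONE]":

{"id": "chatcmpl-123", "object": "chat.completion", "choices": [{"index": 0, "message": {"role": "assistant", "content": "Hello!"}, "finish_reason": "stop"}]}

data: {"id": "chatcmpl-123", "object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"content": "Hello"}, "finish_reason": null}]}
data: [DONE]

Any fields your server omits or renames are what would force changes in the types package.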

Let me know how it goes, hope you get it up and running!