apeatling / simple-guide-to-mlx-finetuning

Generate train.jsonl and valid.jsonl files to use for fine-tuning Mistral and other LLMs.
MIT License

Replace Ollama with LiteLLM to allow for any LLM with OpenAI compatibility #2

Open arunsathiya opened 10 months ago

arunsathiya commented 10 months ago

I came across this comment that highlights LiteLLM as an excellent way to use any LLM that supports the OpenAI format. If you agree, I am happy to submit a PR for it. Here is an example of how it looks on this feature branch:

from litellm import completion

def query_litellm(prompt, context=''):
    # Ask the model for an answer, with any optional context prepended to the prompt.
    messages = [{"content": context + prompt, "role": "user"}]
    response = completion(model="gpt-3.5-turbo", messages=messages)
    answer = response['choices'][0]['message']['content'].strip()

    # Feed the answer back as assistant context and ask for a likely follow-up.
    followup_prompt = "What is a likely follow-up question or request? Return just the text of the question or request."
    followup_messages = [{"content": answer, "role": "assistant"}, {"content": followup_prompt, "role": "user"}]
    followup_response = completion(model="gpt-3.5-turbo", messages=followup_messages)
    followup = followup_response['choices'][0]['message']['content'].strip()

    return answer, followup

In my tests it works well and allows for faster training-data generation by querying cloud-hosted LLMs like GPT-3.5 Turbo, Claude 2, etc. This can be an especially attractive option on MacBooks where local generation is slow.
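
For illustration, here is a rough sketch of how the same call shape could target different providers just by swapping the model string. The model names and environment variables follow LiteLLM's documented conventions but are assumptions for this example, not part of the patch:

import os
from litellm import completion

# Assumption: the relevant provider API keys are set in the environment.
os.environ["OPENAI_API_KEY"] = "sk-..."      # placeholder
os.environ["ANTHROPIC_API_KEY"] = "sk-..."   # placeholder

messages = [{"role": "user", "content": "Write one question/answer pair about MLX fine-tuning."}]

# Same request shape, different backends: LiteLLM translates the
# OpenAI-style call into each provider's native API.
gpt_reply = completion(model="gpt-3.5-turbo", messages=messages)
claude_reply = completion(model="claude-2", messages=messages)

print(gpt_reply['choices'][0]['message']['content'])
print(claude_reply['choices'][0]['message']['content'])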

ivanfioravanti commented 9 months ago

The problem with LiteLLM is its hardcoded list of supported Ollama models, with many constraints on context size: check the JSON file here.
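
For reference, a minimal sketch of how LiteLLM routes to a local Ollama model; the `ollama/` model prefix and `api_base` parameter follow LiteLLM's documented usage, and the model name is an assumption:

from litellm import completion

messages = [{"role": "user", "content": "Hello"}]

# LiteLLM dispatches "ollama/<name>" calls to a local Ollama server.
# The context-window limit applied to each model comes from LiteLLM's
# bundled model-config JSON, which is the constraint raised above.
response = completion(
    model="ollama/mistral",
    messages=messages,
    api_base="http://localhost:11434",
)
print(response['choices'][0]['message']['content'])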