run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Bug]: ServiceContext Not Accepting Fine Tune Model #7467

Closed AndromedaPerseus closed 1 year ago

AndromedaPerseus commented 1 year ago

Bug Description

Trying to pass a fine-tuned gpt-3.5-turbo model into ServiceContext throws an error:

ValueError: Unknown model: ft. Please provide a valid OpenAI model name. Known models are: gpt-4, gpt-4-32k, gpt-4-0613, gpt-4-32k-0613, gpt-4-0314, gpt-4-32k-0314, gpt-3.5-turbo, gpt-3.5-turbo-16k, gpt-3.5-turbo-0613, gpt-3.5-turbo-16k-0613, gpt-3.5-turbo-0301, text-davinci-003, text-davinci-002, text-ada-001, text-babbage-001, text-curie-001, ada, babbage, curie, davinci, gpt-35-turbo-16k, gpt-35-turbo

ft_context = ServiceContext.from_defaults(
    llm=OpenAI(model=ft_model_names, temperature=0.3),
    context_window=2048,
)
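For context, the failing lookup boils down to a dictionary check. Below is a simplified stand-in (the table values are illustrative, not the library's full list); note that the error reports the model as `ft`, meaning the name was truncated before it ever reached the lookup:

```python
# Hypothetical, simplified stand-in for llama_index's context-size lookup;
# the real table in openai_utils.py lists many more models.
ALL_AVAILABLE_MODELS = {
    "gpt-4": 8192,
    "gpt-3.5-turbo": 4096,
}

def openai_modelname_to_contextsize(modelname: str) -> int:
    context_size = ALL_AVAILABLE_MODELS.get(modelname)
    if context_size is None:
        raise ValueError(
            f"Unknown model: {modelname}. Please provide a valid OpenAI model name. "
            "Known models are: " + ", ".join(ALL_AVAILABLE_MODELS.keys())
        )
    return context_size
```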

Version

0.8.12

Steps to Reproduce

Following the steps from this cookbook: https://colab.research.google.com/drive/1NgyCJVyrC2xcZ5lxt2frTU862v6eJHlc?usp=sharing

Relevant Logs/Tracebacks

ValueError                                Traceback (most recent call last)
<ipython-input-59-f4d3f6b9ee98> in <cell line: 15>()
     13 # Option 2: you can also specify the model name manually
     14 ft_model_names = "ft:gpt-3.5-turbo-0613:model-name::9svdEgjk"
---> 15 ft_context = ServiceContext.from_defaults(
     16     llm=OpenAI(model=ft_model_names, temperature=0.3),
     17     context_window=2048,  # limit the context window artifically to test refine process

3 frames
/usr/local/lib/python3.10/dist-packages/llama_index/llms/openai_utils.py in openai_modelname_to_contextsize(modelname)
    191 
    192     if context_size is None:
--> 193         raise ValueError(
    194             f"Unknown model: {modelname}. Please provide a valid OpenAI model name."
    195             "Known models are: " + ", ".join(ALL_AVAILABLE_MODELS.keys())
dosubot[bot] commented 1 year ago

πŸ€–

Hello,

Thank you for reporting this issue. It seems like you're trying to use a fine-tuned model with the ServiceContext in LlamaIndex version 0.8.12.

Based on the error message you're seeing, it appears that the model name you're trying to use ("ft:gpt-3.5-turbo-0613:model-name::9svdEgjk") is not recognized by the openai_modelname_to_contextsize function in openai_utils.py. This function is used to map model names to their respective context sizes, and it only recognizes the model names listed in the error message.

In a similar issue that was previously solved, the solution was to wait for LangChain to update its LLM module to support the new models. You can find more details in the issue [Feature Request]: gpt-3.5-turbo-16k and other new models support.

Additionally, it's worth noting that you should specify only one of llm and llm_predictor when creating a ServiceContext. If you're using a custom LLM (like your fine-tuned gpt-3.5-turbo model), pass it as the llm parameter and leave llm_predictor as None; the llm_predictor will then be created automatically from your llm.
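That rule can be sketched as follows; this is a simplified stand-in to illustrate the constraint, not the actual ServiceContext code:

```python
# Simplified stand-in: pass only `llm`, and the predictor is derived from it.
def make_service_context(llm=None, llm_predictor=None):
    if llm is not None and llm_predictor is not None:
        raise ValueError("Specify only one of llm and llm_predictor")
    if llm_predictor is None:
        llm_predictor = {"llm": llm}  # auto-created from the given llm
    return {"llm_predictor": llm_predictor}
```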

Unfortunately, I couldn't find any changes in the latest version of LlamaIndex (v0.8.13) that could potentially resolve this issue. It's possible that the support for the fine-tuned model you're trying to use has not been added yet.

I hope this helps! If you have any other questions or need further clarification, please let me know.

Best, Dosu


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a πŸ‘ if this is helpful and πŸ‘Ž if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

AndromedaPerseus commented 1 year ago

@dosu-bot, that is invalid. @jerryjliu ran the same logic with a fine-tuned model in the cookbook that I provided without any issues. LangChain already supports fine-tuned model names from OpenAI.

dosubot[bot] commented 1 year ago

πŸ€–

Hello,

The ValueError you're encountering is likely because LlamaIndex version 0.8.12, specifically the AzureOpenAI class, does not support fine-tuned models from OpenAI directly. The AzureOpenAI class requires an engine parameter, which corresponds to the custom name you chose for your deployment when you deployed a model on Azure OpenAI.

However, the OpenAIFinetuneEngine class in LlamaIndex provides a method get_finetuned_model that retrieves the fine-tuned model. This method retrieves the current job and checks if the job has a fine-tuned model id and if the job status is "succeeded". If these conditions are met, it initializes an instance of the OpenAI class with the fine-tuned model id as the model parameter.

Therefore, to resolve this issue, you should provide the name of the deployed azure engine as the engine parameter when initializing the AzureOpenAI class. If you're trying to use a fine-tuned model, you should use the get_finetuned_model method of the OpenAIFinetuneEngine class to retrieve the fine-tuned model and use it in your application.

Additionally, you can pass the fine-tuned model directly into the ServiceContext or specify the model name manually. For example, you can use llm=ft_llm or llm=OpenAI(model=ft_model_name, temperature=0.3) when initializing the ServiceContext.

As for the changes in LlamaIndex version 0.8.13, I'm unable to provide specific details as I don't have that information at the moment. However, it's always a good idea to update to the latest version as it may contain bug fixes or new features that could potentially resolve the issue you're facing.

I hope this helps! If you have any other questions, feel free to ask.

Best, Dosu


logan-markewich commented 1 year ago

@EyeOfHorus396 I'm unable to reproduce this.

This works for me

>>> from llama_index import ServiceContext
>>> from llama_index.llms import OpenAI
>>> ctx = ServiceContext.from_defaults(llm=OpenAI(model="ft:gpt-3.5-turbo-0613:model-name::9svdEgjk"))
>>>

And if I do it with a clearly incorrect name

>>> ctx = ServiceContext.from_defaults(llm=OpenAI(model="clearly fake model name"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/loganm/llama_index_proper/llama_index/llama_index/indices/service_context.py", line 170, in from_defaults
    llm_metadata=llm_predictor.metadata,
                 ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/loganm/llama_index_proper/llama_index/llama_index/llm_predictor/base.py", line 125, in metadata
    return self._llm.metadata
           ^^^^^^^^^^^^^^^^^^
  File "/home/loganm/llama_index_proper/llama_index/llama_index/llms/openai.py", line 104, in metadata
    context_window=openai_modelname_to_contextsize(self._get_model_name()),
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/loganm/llama_index_proper/llama_index/llama_index/llms/openai_utils.py", line 193, in openai_modelname_to_contextsize
    raise ValueError(
ValueError: Unknown model: clearly fake model name. Please provide a valid OpenAI model name.Known models are: gpt-4, gpt-4-32k, gpt-4-0613, gpt-4-32k-0613, gpt-4-0314, gpt-4-32k-0314, gpt-3.5-turbo, gpt-3.5-turbo-16k, gpt-3.5-turbo-0613, gpt-3.5-turbo-16k-0613, gpt-3.5-turbo-0301, text-davinci-003, text-davinci-002, text-ada-001, text-babbage-001, text-curie-001, ada, babbage, curie, davinci, gpt-35-turbo-16k, gpt-35-turbo
>>> 

The code under the hood is here: https://github.com/jerryjliu/llama_index/blob/ecc5a0520cf6c2cb5d872b1fefbe249163ba5717/llama_index/llms/openai_utils.py#L179. Are you sure you passed in the model name correctly? Maybe try reinstalling llama-index.

logan-markewich commented 1 year ago

According to the error, the code parsed the model name as ft, which doesn't seem possible πŸ€”

AndromedaPerseus commented 1 year ago

@logan-markewich please give this a try and let me know what it returns for you:

from llama_index import ServiceContext
from llama_index.llms import OpenAI
ctx = ServiceContext.from_defaults(llm=OpenAI(model="ft:gpt-3.5-turbo-0613:swift-loan::7svdNdjk"))

logan-markewich commented 1 year ago

ah, I see the issue, one sec

logan-markewich commented 1 year ago

Lucky you, you have the one model name that breaks the string parsing πŸ˜†

Will merge the fix in a sec; once it merges you can install from source: pip install git+https://github.com/jerryjliu/llama_index

or wait for the next pypi release
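For readers hitting this later: I don't have the exact patch in front of me, so the condition names and ordering below are assumptions, but this sketch is consistent with the thread. The name "ft:gpt-3.5-turbo-0613:swift-loan::7svdNdjk" breaks a legacy fine-tune check because the org segment "swift-loan" happens to contain the substring "ft-", so the parser keeps the text before the first colon, which is just "ft":

```python
def parse_base_model_buggy(modelname: str) -> str:
    # Legacy fine-tunes looked like "curie:ft-your-org:suffix-2022-02-15".
    # Matching the bare substring "ft-" is too loose: "swift-loan" contains it,
    # so the new-style name is misparsed as a legacy one.
    if "ft-" in modelname:
        return modelname.split(":")[0]
    if modelname.startswith("ft:"):
        return modelname.split(":")[1]
    return modelname

def parse_base_model_fixed(modelname: str) -> str:
    # Check the new-style "ft:<base>:<org>::<id>" prefix first, and anchor
    # the legacy check to ":ft-" so it cannot match inside an org name.
    if modelname.startswith("ft:"):
        return modelname.split(":")[1]
    if ":ft-" in modelname:
        return modelname.split(":")[0]
    return modelname

print(parse_base_model_buggy("ft:gpt-3.5-turbo-0613:swift-loan::7svdNdjk"))  # ft
print(parse_base_model_fixed("ft:gpt-3.5-turbo-0613:swift-loan::7svdNdjk"))  # gpt-3.5-turbo-0613
```

This also explains why Logan's earlier test with "ft:gpt-3.5-turbo-0613:model-name::9svdEgjk" succeeded: that org segment contains no "ft-", so the loose check never fired.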