Closed. AndromedaPerseus closed this issue 1 year ago.
🤖
Hello,
Thank you for reporting this issue. It seems like you're trying to use a fine-tuned model with the `ServiceContext` in LlamaIndex version 0.8.12.

Based on the error message, the model name you're using (`ft:gpt-3.5-turbo-0613:model-name::9svdEgjk`) is not recognized by the `openai_modelname_to_contextsize` function in `openai_utils.py`. This function maps model names to their respective context sizes, and it only recognizes the model names listed in the error message.
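Roughly, that function behaves like the following simplified sketch (not the exact library source; the table values shown are the standard OpenAI context sizes):

```python
# Simplified sketch of openai_modelname_to_contextsize (not the exact source).
ALL_AVAILABLE_MODELS = {
    "gpt-3.5-turbo": 4096,
    "gpt-3.5-turbo-0613": 4096,
    "gpt-4": 8192,
    "gpt-4-32k": 32768,
    # ... plus the other names listed in the error message
}

def openai_modelname_to_contextsize(modelname: str) -> int:
    # Any name missing from the table raises the ValueError seen above.
    if modelname not in ALL_AVAILABLE_MODELS:
        raise ValueError(
            f"Unknown model: {modelname}. Please provide a valid OpenAI model name."
        )
    return ALL_AVAILABLE_MODELS[modelname]
```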
In a similar issue that was previously solved, the solution was to wait for langchain to update its LLM module to support the new models. You can find more details in the issue [Feature Request]: gpt-3.5-turbo-16k and other new models support.

Additionally, it's worth noting that when creating a `ServiceContext`, you should specify only one of `llm` and `llm_predictor`. If you're using a custom LLM (like your fine-tuned gpt-3.5-turbo model), pass it as the `llm` parameter and leave `llm_predictor` as `None`; the `llm_predictor` will then be created automatically from your `llm`.
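For illustration, a minimal sketch of that setup, using the model name from your report:

```python
from llama_index import ServiceContext
from llama_index.llms import OpenAI

# Pass only `llm`; leaving `llm_predictor` unset lets ServiceContext
# build an LLMPredictor from this LLM automatically.
ctx = ServiceContext.from_defaults(
    llm=OpenAI(model="ft:gpt-3.5-turbo-0613:model-name::9svdEgjk", temperature=0.3),
)
```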
Unfortunately, I couldn't find any changes in the latest version of LlamaIndex (v0.8.13) that could potentially resolve this issue. It's possible that the support for the fine-tuned model you're trying to use has not been added yet.
I hope this helps! If you have any other questions or need further clarification, please let me know.
Best, Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
@dosu-bot, that is invalid. @jerryjliu ran the same logic with a fine-tuned model in the cookbook that I provided, without any issues. Langchain already supports fine-tuned model names from OpenAI.
🤖
Hello,
The ValueError you're encountering is likely because LlamaIndex version 0.8.12, specifically the `AzureOpenAI` class, does not support fine-tuned models from OpenAI directly. The `AzureOpenAI` class requires an `engine` parameter, which corresponds to the custom name you chose for your deployment when you deployed a model on Azure OpenAI.
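For example, a minimal sketch (the deployment name is a placeholder):

```python
from llama_index.llms import AzureOpenAI

# "my-azure-deployment" stands in for whatever name you gave the
# deployment in the Azure OpenAI portal.
llm = AzureOpenAI(
    engine="my-azure-deployment",
    model="gpt-35-turbo",
    temperature=0.3,
)
```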
However, the `OpenAIFinetuneEngine` class in LlamaIndex provides a `get_finetuned_model` method that retrieves the fine-tuned model. It fetches the current job and checks that the job has a fine-tuned model id and that the job status is "succeeded". If those conditions are met, it initializes an instance of the `OpenAI` class with the fine-tuned model id as the `model` parameter.

Therefore, to resolve this issue, provide the name of the deployed Azure engine as the `engine` parameter when initializing the `AzureOpenAI` class. If you're trying to use a fine-tuned model, use the `get_finetuned_model` method of the `OpenAIFinetuneEngine` class to retrieve it and use it in your application.
Additionally, you can pass the fine-tuned model directly into the `ServiceContext`, or specify the model name manually; for example, `llm=ft_llm` or `llm=OpenAI(model=ft_model_name, temperature=0.3)` when initializing the `ServiceContext`.
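Putting those options together, a hedged sketch (the training file name is a placeholder, and `OpenAIFinetuneEngine` is imported from `llama_index.finetuning` in this version):

```python
from llama_index import ServiceContext
from llama_index.finetuning import OpenAIFinetuneEngine

# "finetuning_events.jsonl" is a placeholder for your prepared training data.
finetune_engine = OpenAIFinetuneEngine("gpt-3.5-turbo", "finetuning_events.jsonl")
finetune_engine.finetune()

# Returns an OpenAI LLM pointed at the fine-tuned model id once the
# fine-tuning job reports "succeeded".
ft_llm = finetune_engine.get_finetuned_model(temperature=0.3)

ctx = ServiceContext.from_defaults(llm=ft_llm)
```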
As for the changes in LlamaIndex version 0.8.13, I'm unable to provide specific details as I don't have that information at the moment. However, it's always a good idea to update to the latest version as it may contain bug fixes or new features that could potentially resolve the issue you're facing.
I hope this helps! If you have any other questions, feel free to ask.
Best, Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
@EyeOfHorus396 I'm unable to reproduce this.
This works for me:

```
>>> from llama_index import ServiceContext
>>> from llama_index.llms import OpenAI
>>> ctx = ServiceContext.from_defaults(llm=OpenAI(model="ft:gpt-3.5-turbo-0613:model-name::9svdEgjk"))
>>>
```
And if I do it with a clearly incorrect name:

```
>>> ctx = ServiceContext.from_defaults(llm=OpenAI(model="clearly fake model name"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/loganm/llama_index_proper/llama_index/llama_index/indices/service_context.py", line 170, in from_defaults
    llm_metadata=llm_predictor.metadata,
                 ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/loganm/llama_index_proper/llama_index/llama_index/llm_predictor/base.py", line 125, in metadata
    return self._llm.metadata
           ^^^^^^^^^^^^^^^^^^
  File "/home/loganm/llama_index_proper/llama_index/llama_index/llms/openai.py", line 104, in metadata
    context_window=openai_modelname_to_contextsize(self._get_model_name()),
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/loganm/llama_index_proper/llama_index/llama_index/llms/openai_utils.py", line 193, in openai_modelname_to_contextsize
    raise ValueError(
ValueError: Unknown model: clearly fake model name. Please provide a valid OpenAI model name.Known models are: gpt-4, gpt-4-32k, gpt-4-0613, gpt-4-32k-0613, gpt-4-0314, gpt-4-32k-0314, gpt-3.5-turbo, gpt-3.5-turbo-16k, gpt-3.5-turbo-0613, gpt-3.5-turbo-16k-0613, gpt-3.5-turbo-0301, text-davinci-003, text-davinci-002, text-ada-001, text-babbage-001, text-curie-001, ada, babbage, curie, davinci, gpt-35-turbo-16k, gpt-35-turbo
>>>
```
The code under the hood is here: https://github.com/jerryjliu/llama_index/blob/ecc5a0520cf6c2cb5d872b1fefbe249163ba5717/llama_index/llms/openai_utils.py#L179. Are you sure you passed in the model name correctly? Maybe try reinstalling llama-index?
According to the error, the code parsed the model name as `ft`, which doesn't seem possible 🤔
@logan-markewich please give this a try and let me know what it returns for you:
```python
from llama_index import ServiceContext
from llama_index.llms import OpenAI

ctx = ServiceContext.from_defaults(llm=OpenAI(model="ft:gpt-3.5-turbo-0613:swift-loan::7svdNdjk"))
```
ah, I see the issue, one sec
Lucky you, you have the one model name that breaks the string parsing 😅
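For the curious, this is likely what happened (a reconstruction of the failure, not the exact library code): the legacy fine-tune check looked for the substring `"ft-"` anywhere in the model name, and `"swift-loan"` happens to contain it, so the name was split on `":"` and the first piece, `"ft"`, went into the context-size lookup:

```python
modelname = "ft:gpt-3.5-turbo-0613:swift-loan::7svdNdjk"

# Buggy order of checks (reconstruction): the legacy "ft-" substring test
# matches inside "swift-loan", so the wrong branch runs.
if "ft-" in modelname:                    # legacy "<base>:ft-<org>-<date>" names
    modelname = modelname.split(":")[0]   # -> "ft"  (wrong!)
elif modelname.startswith("ft:"):         # new "ft:<base>:<org>::<id>" names
    modelname = modelname.split(":")[1]   # -> "gpt-3.5-turbo-0613"

# The fix is to test the "ft:" prefix before the legacy substring check.
print(modelname)
```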
Will merge the fix in a sec; you can install from source once it merges:

```
pip install git+https://github.com/jerryjliu/llama_index
```

or wait for the next PyPI release.
Bug Description
Trying to pass the fine-tuned gpt-3.5-turbo model into `ServiceContext` throws an error:

```
ValueError: Unknown model: ft. Please provide a valid OpenAI model name.Known models are: gpt-4, gpt-4-32k, gpt-4-0613, gpt-4-32k-0613, gpt-4-0314, gpt-4-32k-0314, gpt-3.5-turbo, gpt-3.5-turbo-16k, gpt-3.5-turbo-0613, gpt-3.5-turbo-16k-0613, gpt-3.5-turbo-0301, text-davinci-003, text-davinci-002, text-ada-001, text-babbage-001, text-curie-001, ada, babbage, curie, davinci, gpt-35-turbo-16k, gpt-35-turbo
```

```python
ft_context = ServiceContext.from_defaults(
    llm=OpenAI(model=ft_model_names, temperature=0.3),
    context_window=2048,
)
```
Version
0.8.12
Steps to Reproduce
Following the steps from this cookbook: https://colab.research.google.com/drive/1NgyCJVyrC2xcZ5lxt2frTU862v6eJHlc?usp=sharing
Relevant Logs/Tracebacks