Thanks for your suggestion!
We will support open-source LLMs and Azure in future versions!
Hey @zhaochenyang20 , let's only assign issues to project team members who can fix them, so I unassigned @yonetaniryo .
@yonetaniryo , thanks for the report! We'll likely fix this as part of the larger issue #288 , but I'll keep this open until we do.
Sure, thank you both for your consideration!
I believe this should now be supported as part of litellm.
Yes, I believe so! @yonetaniryo , I'll close this issue but please re-open if it's not working.
Thank you very much for the great work! It's wonderful that prompt2model is now going to support multiple types of LLMs.
Currently, it appears that prompt2model_demo.py is not yet fully compatible with the Azure OpenAI service without modification. This seems to stem from a few factors, including the difference in model names (such as gpt-3.5-turbo versus gpt-35-turbo) and the need for additional information (like AZURE_API_BASE and AZURE_API_VERSION, as mentioned here).
However, this may not pose a problem if the demo is intended to be a singular example that works exclusively with the OpenAI API, and not with other LLMs including the Azure one. So I just wanted to make it clear: with this repo, are users expected to develop their own components, such as an AzureInstructionParser instead of OpenAIInstructionParser?
@yonetaniryo the difference in model names (gpt-3.5-turbo vs. gpt-35-turbo) shouldn't be an issue. What's required is your deployment id/name (e.g. "chatgpt-test" is a deployment name in this example, so the model name to litellm would be azure/chatgpt-test).
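A minimal sketch of that call, assuming the "chatgpt-test" deployment name from the example above (the env variables follow litellm's Azure setup mentioned earlier in this thread):

from litellm import completion

# AZURE_API_KEY / AZURE_API_BASE / AZURE_API_VERSION must already be set
# in the environment before this call.
response = completion(
    model="azure/chatgpt-test",  # azure/<deployment-name>
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)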
@neubig is the key change allowing the user to pass in a model name here?
Hi, I actually found some issues with the last commit in testing and am fixing them now.
I believe this should now work through litellm. @yonetaniryo , if you could take a look that'd be awesome!
Sorry for not replying for a long time. I finally got around to trying an update, but unfortunately Azure still doesn't seem to be working properly.
In prompt2model_demo.py, I got the error openai.error.AuthenticationError: Incorrect API key provided at File "/***/***/programs/prompt2model/prompt2model/utils/api_tools.py", line 106, in generate_one_completion response = completion( # completion gets the key from os.getenv.
I confirmed that this error occurs when /openai/api_requestor.py is called by litellm. However, the Azure sample in litellm runs without error, so I suspect that there is something wrong with how litellm is being used here.
Sorry for the lack of information! I'll try to look into it more closely when I get some more time, though it's difficult to do so in the immediate future.
Sorry if I'm misunderstanding something.
When we call a prompt parser, we instantiate chat_api, which is an APIAgent with default arguments.
https://github.com/neulab/prompt2model/blob/13a2049be64164c85bf548cbe9143fd0838fbaa6/prompt2model/utils/parse_json_responses.py#L76
https://github.com/neulab/prompt2model/blob/13a2049be64164c85bf548cbe9143fd0838fbaa6/prompt2model/utils/api_tools.py#L257
And APIAgent is by default given model_name: str = "gpt-3.5-turbo" and api_base: str | None = None, and feeds them to litellm's completion:
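Roughly, the flow looks like this (a condensed, paraphrased sketch, not the exact prompt2model source):

from litellm import completion

class APIAgent:
    def __init__(self, model_name: str = "gpt-3.5-turbo", api_base: str | None = None):
        self.model_name = model_name
        self.api_base = api_base

    def generate_one_completion(self, prompt: str):
        # completion() picks up the API key from the environment via os.getenv
        return completion(
            model=self.model_name,
            api_base=self.api_base,
            messages=[{"role": "user", "content": prompt}],
        )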
However, as shown in their documentation, this can work only when we use the OpenAI API:
from litellm import completion
import os
## set ENV variables
os.environ["OPENAI_API_KEY"] = "your-api-key"
response = completion(
    model="gpt-3.5-turbo",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
)
In contrast, their Azure example requires extra environment variables and an azure/ prefix on the model name:

from litellm import completion
import os
## set ENV variables
os.environ["AZURE_API_KEY"] = ""
os.environ["AZURE_API_BASE"] = ""
os.environ["AZURE_API_VERSION"] = ""
# azure call
response = completion(
    model="azure/<your_deployment_name>",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
)
From the documentation of litellm, I wonder if 1) we should specify azure/**, huggingface/**, etc. to indicate which LLM we are going to call, and 2) we should provide customized env variables, maybe at the beginning of prompt2model_demo.py. For example, if I change the default model name to azure/gpt-35-turbo and manually specify the env variables necessary for Azure, the parser works, but I then get another error: ERROR:root:This model's maximum context length is 4096 tokens. However, you requested 8399 tokens (2105 in the messages, 6294 in the completion)
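Concretely, the workaround I tried looked roughly like this (the values below are placeholders from my setup, not something the repo prescribes):

# Added near the top of prompt2model_demo.py (workaround sketch):
import os

os.environ["AZURE_API_KEY"] = "<your-azure-key>"
os.environ["AZURE_API_BASE"] = "https://<your-resource>.openai.azure.com/"
os.environ["AZURE_API_VERSION"] = "<api-version>"

# ...and the default model name changed from "gpt-3.5-turbo"
# to "azure/gpt-35-turbo" in prompt2model/utils/api_tools.py.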
You are right. In order to use Azure, you need to set the three Azure-specific variables, either in your code or via export AZURE_API_KEY=... (and so on for the other Azure-specific env variables) in your terminal, and specify azure/** as the model name.
os.environ["AZURE_API_KEY"] = "..."
os.environ["AZURE_API_BASE"] = "..."
os.environ["AZURE_API_VERSION"] = '...'
model_name = "azure/GPT-3-5-turbo-chat"
azure_api_agent = APIAgent(model_name=model_name)
In order to overcome the issue with the maximum context length, set the max_tokens param to your model's context length:

azure_api_agent = APIAgent(model_name=model_name, max_tokens=4000)  # set to 8000 if that is your model's allowed context length
Thank you very much! It’s reasonable to manually specify env variables and model names for each specific LLM type. Throughout this issue thread I just wanted to make clear whether the current demo is intentionally designed for the OpenAI API only. If we need modifications for other LLMs, including Azure, that’s fine.
Could I ask one more? In some previous versions, I remember both gpt-35-turbo and gpt-35-turbo-16k were used. But in the current version, default_api_agent appears to be always used. Is there any minimum spec recommendation for LLMs to be called there?
the current demo is intentionally designed for OpenAI API only
The current demo is designed to support arbitrary LLMs, though this will require some configuration of the API Agent (example in this twitter post).
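For instance, a hedged sketch of that configuration for Azure, assuming the default_api_agent override pattern mentioned in the question above (the exact attribute path is my assumption, not confirmed in this thread):

import os

from prompt2model.utils import api_tools
from prompt2model.utils.api_tools import APIAgent

# Azure credentials read from the environment by litellm
os.environ["AZURE_API_KEY"] = "..."
os.environ["AZURE_API_BASE"] = "..."
os.environ["AZURE_API_VERSION"] = "..."

# Swap the default agent so all components use the Azure deployment.
api_tools.default_api_agent = APIAgent(
    model_name="azure/<your_deployment_name>", max_tokens=4000
)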
In some previous versions, I remember both gpt-35-turbo and gpt-35-turbo-16k were used. But in the current version, default_api_agent appears to be always used. Is there any minimum spec recommendation for LLMs to be called there?
We recently modified our system so that 4K-token models (e.g. gpt-3.5-turbo) should be sufficient for most user prompts. I think that's the only spec that would potentially cause Prompt2Model to completely crash. You will also likely want to use an LLM that has been instruction fine-tuned. Other than that, different LLMs may provide different generation speeds and accuracies.
Sure. Thank you for your continuous support. Now it’s ok for me to close the issue.
Thank you for the discussion!
Thank you very much for a really cool project!
I just wanted to report that I had to fix several things to make the demo work with the Azure OpenAI Service.

1. Model names: gpt-3.5-turbo in the OpenAI API is actually gpt-35-turbo in the Azure OpenAI API. Currently OpenAIInstructionParser does not accept model names as input, and I had to directly modify the following line: https://github.com/neulab/prompt2model/blob/3b420bff75750bcdc3c4a1a90a5a121d7f61ccc4/prompt2model/utils/openai_tools.py#L41
2. ChatCompletion.create: I also found that model= in the OpenAI API is actually engine= in the Azure version (see https://github.com/openai/openai-python/issues/569#issuecomment-1674082177). This difference also requires several manual fixes. For example: https://github.com/neulab/prompt2model/blob/3b420bff75750bcdc3c4a1a90a5a121d7f61ccc4/prompt2model/utils/openai_tools.py#L81
3. openai.api_version, openai.api_type, and openai.api_base need to be configured somewhere.

Maybe this is more a question of whether OpenAI could make the original and Azure versions more consistent. But it would be helpful if these points could be addressed or mentioned in this repository for Azure OpenAI users.
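For reference, a hedged illustration of the Azure-flavored call using the legacy openai-python interface that openai_tools.py is built on (all values are placeholders; the API version string is something you would fill in, not one the repo specifies):

import openai

# Azure-specific client configuration (legacy openai-python < 1.0 style)
openai.api_type = "azure"
openai.api_base = "https://<your-resource>.openai.azure.com/"
openai.api_version = "<api-version>"
openai.api_key = "<your-azure-key>"

response = openai.ChatCompletion.create(
    engine="gpt-35-turbo",  # Azure uses engine=<deployment>, not model=
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)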