Azure Deployment Name Bug

JiaCYu commented 1 month ago

Policy and info

Maintainers will close issues that have been stale for 14 days if they contain relevant answers.
Adding the label "sweep" will automatically turn the issue into a coded pull request. Works best for mechanical tasks. More info/syntax at: https://docs.sweep.dev/

Expected Behavior

There shouldn't be an error with the model name.

Current Behavior

Deployment name seems to mix with model name.

Everything seems to work perfectly and code is being made:

But then an error pops up telling me that the model doesn't exist and it takes my Azure OpenAI deployment name and says it's not a model.

Here is the command style I used following these instructions from here: https://gpt-engineer.readthedocs.io/en/latest/open_models.html

gpt-engineer --azure [redacted_endpoint_url] ./snake_game/ [redacted_deployment_name]

Additional Failure Information

Using Azure OpenAI with gpt-4-turbo deployed with a different deployment name. Only installed gpt-engineer in a virtual environment.

JiaCYu commented 1 month ago

I just reread the instructions and I realized that the Azure Deployment Name should be the model name as well. Which kind of sucks because there are a bunch of projects dependent on that deployment name and a lot refactoring would have to be done.

viborc commented 1 month ago

Hey @JiaCYu, we have our community meeting today, and I'll make sure we discuss this topic with the team and get back to you!

captivus commented 1 month ago

If I understand the issue you're reporting correctly, this is due to the parameters that OpenAI expects for Azure deployments of its model and is not a GPTE-specific issue.

captivus commented 1 month ago

@viborc I have a meeting conflict with today's dev meeting, but I can speak more about this if the issue isn't resolved during the meeting today.

viborc commented 1 month ago

Sounds good @captivus! I might assign you to this one if we don't figure it out during the meeting!

viborc commented 1 month ago

We decided to use your very generous offer to help us with this @captivus and @zigabrencic mentioned that he'll add some comments here, too!

zigabrencic commented 1 month ago

Hey.

As discussed in the meeting today. My proposal would be to to sync the Azur part with same approach we use open models and Open router. For example (from here):

export OPENAI_API_BASE="https://openrouter.ai/api/v1"
export OPENAI_API_KEY="sk-key-from-open-router"
export MODEL_NAME="meta-llama/llama-3-8b-instruct:extended"

gpte <project_dir> $MODEL_NAME --lite --temperature 0.1

Instead of using custom --azure flag.

@captivus what do you think? I don't know their hosting specs so I'm not sure if it's doable.

captivus commented 1 month ago

Looking at this further, I think the bug may be ours and due to the way that we implement the model version in an opinionated way:

https://github.com/gpt-engineer-org/gpt-engineer/blob/66986568a116308743bc624f8ffbabe8b82ce897/gpt_engineer/core/ai.py#L349

Let's discuss in our dev meeting.

captivus commented 1 month ago

More thoughts here. @JiaCYu can you please set the OPENAI_API_VERSION environment variable with your deployment's version and try again? The internals I've linked to above show how we handle environment variable when it is not set, and is consistent with the error messages you've provided. Without knowing more about your specific deployment, it's difficult to advise specifically what this value should be.

This documentation may be helpful. This issue resolution may also prove helpful.

Please try and feedback!

JiaCYu commented 1 month ago

@captivus Sorry for the late reply.

I set OPEN_API_VERSION in the environmental variable in powershell, and I still get the same error as I do from the original post:

The deployment I have is named gpt-4-[redacted, but there are 4 letters here] and the deployment of the actual model is just gpt-4-turbo ver. 0125-preview:

captivus commented 3 weeks ago

Can you please try referencing one of the available deployments in your environment, as shown in the output you provided? The error suggests that you are trying to use a model that is unavailable.

Also, kindly share text outputs in addition to screenshots if additional debugging is needed.

We will upgrade the default version to use the latest supported model in Azure, which is 2024-05-01-preview per the docs.

JiaCYu commented 2 weeks ago

So, the "Unknown model" that the error message contains is my deployment name that I mentioned here:

The deployment I have is named gpt-4-[redacted, but there are 4 letters here]

The thing is, my deployment name is different than the actual model name itself.

Also, kindly share text outputs in addition to screenshots if additional debugging is needed.

Here is the command I use with gpt-engineer:

The first red box is my azure endpoint and the last red box is the last 4 letters of my deployment name on azure.

I think it may be taking my deployment name and entering it as the model name at some point. I don't think there is a way to enter the model name separately.

rohansasmal123 commented 2 weeks ago

Passing azure model deployment name in --model tag solved the issue. Here is the command: gpt-engineer --azure <AZURE_OPENAI_API> --model <DEPLOYMENT_NAME> ./projects/snake

JiaCYu commented 1 week ago

Passing azure model deployment name in --model tag solved the issue. Here is the command: gpt-engineer --azure <AZURE_OPENAI_API> --model <DEPLOYMENT_NAME> ./projects/snake

captivus commented 1 week ago

Passing azure model deployment name in --model tag solved the issue.

Here is the command:

gpt-engineer --azure <AZURE_OPENAI_API> --model <DEPLOYMENT_NAME> ./projects/snake

This is the expected behavior @JiaCYu. I see your response indicating that this hasn't worked for you. Assuming you're escaping characters in the name of your model, as needed from the shell, I'm at a loss as to how to further debug the issue.

captivus commented 1 week ago

Can you please try referencing one of the available deployments in your environment, as shown in the output you provided? The error suggests that you are trying to use a model that is unavailable.

Also, kindly share text outputs in addition to screenshots if additional debugging is needed.

We will upgrade the default version to use the latest supported model in Azure, which is 2024-05-01-preview per the docs.

I've just merged the upgrade to the default version that I suggested we'd incorporate. Does that change anything for you?

gpt-engineer-org / gpt-engineer