ricklamers / gpt-code-ui

An open source implementation of OpenAI's ChatGPT Code interpreter
MIT License

fix: azure deployment #76

Closed kenhktsui closed 1 year ago

kenhktsui commented 1 year ago

Description

This PR fixes a bug in Azure OpenAI Services.

Motivation and Context

The existing code always passes "gpt-3.5-turbo" as the "deployment_id". In Azure, the deployment ID is an arbitrary name chosen when the model is deployed, so the hard-coded value fails to locate the resource.

The proposed design adds an extra key, "deployment", to each model entry. The deployment corresponding to the selected model name is then passed as the "engine" to the OpenAI API, as sketched below.

[{"displayName": "GPT-3.5", "name": "gpt-3.5-turbo", "deployment" :"your-gpt-3.5-deployment"}, {"displayName": "GPT-4", "name": "gpt-4", "deployment": "your-gpt-4-deployment"}]

This should also close the issue raised in https://github.com/ricklamers/gpt-code-ui/issues/5#issuecomment-1580581178.

How has this been tested?

It has been tested locally with satisfactory results.

Types of changes

Fix

dasmy commented 1 year ago

Thank you for looking into this.

I have to confess that I do not understand the change: According to https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/switching-endpoints#keyword-argument-for-model, the deployment_id and engine have identical meaning for Azure.

If I read your code correctly, you expect the model variable to contain the content of the corresponding name entry. That is as designed: just put the content of your deployment field into the name field in the AZURE_OPENAI_DEPLOYMENTS JSON, and it should properly end up in the deployment_id keyword (assuming there is no other bug anywhere).
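For reference, the linked docs boil down to the following equivalence; a minimal sketch assuming the pre-1.0 openai SDK and placeholder Azure settings:

```python
import openai

# Placeholder Azure settings (values are examples only):
openai.api_type = "azure"
openai.api_base = "https://your-resource.openai.azure.com"
openai.api_version = "2023-03-15-preview"
openai.api_key = "your-key"

msgs = [{"role": "user", "content": "Hello"}]

# For api_type="azure", deployment_id and engine are interchangeable in the pre-1.0 SDK:
resp_a = openai.ChatCompletion.create(deployment_id="gpt-35-turbo-0613", messages=msgs)
resp_b = openai.ChatCompletion.create(engine="gpt-35-turbo-0613", messages=msgs)
```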

kenhktsui commented 1 year ago

> Thank you for looking into this.
>
> I have to confess that I do not understand the change: According to https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/switching-endpoints#keyword-argument-for-model, the deployment_id and engine have identical meaning for Azure.
>
> If I read your code correctly, you expect the model variable to contain the content of the corresponding name entry. That is as designed: just put the content of your deployment field into the name field in the AZURE_OPENAI_DEPLOYMENTS JSON, and it should properly end up in the deployment_id keyword (assuming there is no other bug anywhere).

Thanks for your response @dasmy! I am new to this repo, so I am not entirely familiar with the overall design, but the fix works for me. Please see if it makes sense to you.

I tried that yesterday, but I ran into an issue where it could not locate the deployment. https://github.com/ricklamers/gpt-code-ui/blob/82dbd1dc0803c7682a34fcb6df70082e6f08ef6f/gpt_code_ui/webapp/main.py#L250 When I inspected request.json, 'model' was always "gpt-3.5-turbo" rather than the deployment name I set in .env. So we could either change the frontend (which I am not good at) or handle it in the backend with an additional mapping.

Also, since Azure has separate concepts for the model name and the deployment name, my idea was to translate the model name into the deployment name, keeping the concept consistent for both OpenAI and Azure OpenAI. A rough sketch of that translation follows.
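This is only a hypothetical illustration of the backend-side translation; the route, mapping and variable names are made up and do not reflect the actual handler in gpt_code_ui/webapp/main.py:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical mapping from the model name the frontend sends to the Azure deployment name.
MODEL_TO_DEPLOYMENT = {
    "gpt-3.5-turbo": "your-gpt-3.5-deployment",
    "gpt-4": "your-gpt-4-deployment",
}


@app.route("/generate", methods=["POST"])
def generate():
    model = request.json.get("model", "gpt-3.5-turbo")  # frontend still sends the model name
    deployment = MODEL_TO_DEPLOYMENT.get(model, model)   # backend maps it to the deployment
    # ... call the OpenAI API with engine=deployment instead of the raw model name ...
    return jsonify({"model": model, "deployment": deployment})
```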

dasmy commented 1 year ago

This is strange. I just tried again using the current master branch without any additional changes and it works: for me, model inside the get_code() function was, depending on the frontend selection, either gpt-35-turbo-0613 or gpt-4-0314. Also, with OPENAI_API_LOGLEVEL=debug set inside my .env, I can see that the corresponding endpoints are used for the requests.

My .env setting for the deployments reads

AZURE_OPENAI_DEPLOYMENTS=[{"displayName": "GPT-3.5", "name": "gpt-35-turbo-0613"}, {"displayName": "GPT-4", "name": "gpt-4-0314"}]

As per your comments, model was always gpt-3.5-turbo at line 250 of webapp.py. That variable is fed into get_code() and later used by your code. How would that work if model was not changing?

I am still trying to understand what went wrong in the first place. Can you possibly try again with the unchanged current main and, e.g., the following settings in your .env?

[{"displayName": "GPT-3.5", "name" :"your-gpt-3.5-deployment"}, {"displayName": "GPT-4", "name": "your-gpt-4-deployment"}, , {"displayName": "Cheesecake", "name": "cheesecake"}]
OPENAI_API_LOGLEVEL=debug

I also added a non-existing model and deployment ("Cheesecake") to see whether it properly ends up in the frontend list and makes its way through the requests (where it will fail with an error saying that this deployment does not exist).

kenhktsui commented 1 year ago

> This is strange. I just tried again using the current master branch without any additional changes and it works: for me, model inside the get_code() function was, depending on the frontend selection, either gpt-35-turbo-0613 or gpt-4-0314. Also, with OPENAI_API_LOGLEVEL=debug set inside my .env, I can see that the corresponding endpoints are used for the requests.
>
> My .env setting for the deployments reads
>
> AZURE_OPENAI_DEPLOYMENTS=[{"displayName": "GPT-3.5", "name": "gpt-35-turbo-0613"}, {"displayName": "GPT-4", "name": "gpt-4-0314"}]
>
> As per your comments, model was always gpt-3.5-turbo at line 250 of webapp.py. That variable is fed into get_code() and later used by your code. How would that work if model was not changing?
>
> I am still trying to understand what went wrong in the first place. Can you possibly try again with the unchanged current main and, e.g., the following settings in your .env?
>
> AZURE_OPENAI_DEPLOYMENTS=[{"displayName": "GPT-3.5", "name": "your-gpt-3.5-deployment"}, {"displayName": "GPT-4", "name": "your-gpt-4-deployment"}, {"displayName": "Cheesecake", "name": "cheesecake"}]
> OPENAI_API_LOGLEVEL=debug
>
> I also added a non-existing model and deployment ("Cheesecake") to see whether it properly ends up in the frontend list and makes its way through the requests (where it will fail with an error saying that this deployment does not exist).

Thanks for replying. This is also strange to me, but hopefully we can resolve it. I checked out main and set the .env as you suggested. I didn't select any model or enter the API key in the frontend before prompting. The error below was thrown afterwards (I masked some info).

message='OpenAI API response' path=https://maskedbaseurl.openai.azure.com/openai/deployments/gpt-3.5-turbo/chat/completions?api-version=2023-03-15-preview processing_ms=4.4474 request_id=None response_code=404
body='{"error":{"code":"DeploymentNotFound", "message":"The API deployment for this resource does not exist. If you created the deployment within the last 5 minutes, please wait a moment and try again."}}' headers="{'Content-Length': '198', 'Content-Type': 'application/json', 'OpenAI-Processing-Ms': '4.4474', 'apim-request-id': 'xxxxxxxxxxxxx', 'Strict-Transport-Security': 'max-age=31536000; includeSubDomains; preload', 'x-content-type-options': 'nosniff', 'x-ms-region': 'East US', 'Date': 'Thu, 03 Aug 2023 09:31:02 GMT'}" message='API response body'
error_code=DeploymentNotFound error_message='The API deployment for this resource does not exist. If you created the deployment within the last 5 minutes, please wait a moment and try again.' error_param=None error_type=None message='OpenAI API error received' stream_error=False

I wonder what your request.json looks like after this line? https://github.com/ricklamers/gpt-code-ui/blob/82dbd1dc0803c7682a34fcb6df70082e6f08ef6f/gpt_code_ui/webapp/main.py#L247 For me, 'model' is "gpt-3.5-turbo" rather than the deployment name I put in the .env.
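In case it helps to compare: a tiny debugging fragment, meant to be pasted into the existing Flask handler right after that line (a temporary print only, not part of the fix; `request` should already be available there):

```python
from flask import request  # presumably already imported in webapp/main.py

# Temporary debugging aid: dump the raw payload to confirm which model the frontend sent.
print("request.json:", request.json)  # e.g. {'prompt': '...', 'model': 'gpt-3.5-turbo'}
```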

dasmy commented 1 year ago

This is how it looks for me (gpt-4-32k-0613 is one of the names mentioned in my .env):

[screenshot]

Can you check whether the entries from your .env appear at all in the model selection dropdown in the lower left of the window? For the example above, it should also contain the "Cheesecake" entry.

Also, you could verify whether the /models endpoint is called and what content is sent for the AVAILABLE_MODELS constant. It should be consistent with your .env settings:

[screenshot]
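A quick way to check that endpoint yourself (the port is an assumption here, taken as the webapp's default local port; adjust if you changed it):

```python
import requests

# Fetch the model list served to the frontend; it should mirror AZURE_OPENAI_DEPLOYMENTS,
# including the "Cheesecake" test entry if you added it.
print(requests.get("http://localhost:8080/models").json())
```
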
kenhktsui commented 1 year ago

@dasmy Thanks for your time. After some investigation I now understand the behaviour: I hit this bug when I did not select a model in the frontend, in which case it kept using the value cached by the frontend from the old .env, even though I had changed the .env. Once I select the model, it correctly uses the updated value from the new .env.