ChatGPTNextWeb / ChatGPT-Next-Web

A cross-platform ChatGPT/Gemini UI (Web / PWA / Linux / Win / MacOS). One-click to get your own cross-platform ChatGPT/Gemini app.
https://app.nextchat.dev/
MIT License
76.58k stars 59.15k forks

[Bug] Chat Response cut off with Azure Configured #5474

Open itsamejoshab opened 1 month ago

itsamejoshab commented 1 month ago

📩 Deployment Method

Official installation package

📌 Version

2.15.3

đŸ’» Operating System

macOS

📌 System Version

14.5

🌐 Browser

Chrome

📌 Browser Version

128.0.6613.138

🐛 Bug Description

Configuring the NextChat packaged client to work with Azure causes the response text to get cut off in the chat.

This was not an issue with older versions of the client; it started happening after I upgraded to 2.15.2.

(image attached)

đŸ“· Recurrence Steps

Model Provider: Azure
Azure Endpoint: https://{resource-url}/openai
Custom Models: -all,{modelname}@azure={deploymentName}
Max Tokens: 4000
Attached Message Count: 5
History Compression Threshold: 5000
Memory Prompt: yes
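For context on the `{modelname}@azure={deploymentName}` entry: Azure OpenAI routes requests by deployment name rather than model name, so the client has to translate the configured pair into a deployment-scoped URL. A minimal sketch of that mapping (the function name, the placeholder values, and the `api-version` default are illustrative, not NextChat's actual code):

```python
def azure_chat_url(endpoint: str, deployment: str,
                   api_version: str = "2024-02-15-preview") -> str:
    """Build an Azure OpenAI chat-completions URL from the configured
    endpoint and a deployment name (illustrative helper, not NextChat code)."""
    # Azure addresses a specific deployment, which is why the Custom Models
    # setting pairs a model name with a deployment via @azure=.
    base = endpoint.rstrip("/")
    return (f"{base}/deployments/{deployment}"
            f"/chat/completions?api-version={api_version}")
```

If a misconfigured deployment name or endpoint produced an error mid-stream, the client could show a truncated reply, so verifying this URL resolves correctly is one way to narrow the bug down.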

🚩 Expected Behavior

The entire response should be shown instead of only about 10-15 tokens

📝 Additional Information

No response

nextchat-manager[bot] commented 1 month ago

Please follow the issue template to update the description of your issue.

Issues-translate-bot commented 1 month ago

Bot detected the issue body's language is not English, translate it automatically.


Title: [Bug]

H0llyW00dzZ commented 1 month ago

> (quoted the issue description and reproduction steps above)

I am pretty sure these are network issues caused by instability when connecting directly to the endpoint. This is unlike using an external endpoint mechanism, which has been very stable in my testing (tried with One API hosted on my website https://oneapi.b0zal.io/).

itsamejoshab commented 1 month ago

If someone else is using Azure on the same version, 2.15.3, it'd be nice to get some comments. I feel pretty certain it isn't network instability: it fails 100% of the time with the 2.15.3 client and 0% of the time if I roll back a few releases. I have also tested this endpoint extensively in other contexts on a solid network, and have never had any latency or network concerns.