ChatGPTNextWeb / ChatGPT-Next-Web

A cross-platform ChatGPT/Gemini UI (Web / PWA / Linux / Windows / macOS). One click to get your own cross-platform ChatGPT/Gemini app.
https://app.nextchat.dev/
MIT License

[Bug] Chat response cut off with Azure configured #5474

Open itsamejoshab opened 2 days ago

itsamejoshab commented 2 days ago

📩 Deployment Method

Official installation package

📌 Version

2.15.2

đŸ’» Operating System

macOS

📌 System Version

14.5

🌐 Browser

Chrome

📌 Browser Version

128.0.6613.138

🐛 Bug Description

Configuring the NextChat packaged client to work with Azure causes the response text to get cut off in the chat.

This was not an issue with prior versions of the client; it only started after I upgraded to 2.15.2.

(screenshot: response truncated after a few tokens in the chat window)

đŸ“· Recurrence Steps

Model Provider: Azure
Azure Endpoint: https://{resource-url}/openai
Custom Models: -all,{modelname}@azure={deploymentName}
Max Tokens: 4000
Attached Message Count: 5
History Compression Threshold: 5000
Memory Prompt: yes
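For reference, the same setup expressed through NextChat's server-side environment variables would look roughly like the sketch below. The API key and API version values are placeholders I've added, not values from this report:

```
# Sketch of the equivalent NextChat env configuration (placeholder values).
AZURE_URL=https://{resource-url}/openai
AZURE_API_KEY={your-azure-api-key}   # placeholder
AZURE_API_VERSION=2024-02-01         # assumed; use your deployment's API version
# Hide all built-in models, then expose the Azure deployment under its model name.
CUSTOM_MODELS=-all,{modelname}@azure={deploymentName}
```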

🚩 Expected Behavior

The entire response should be shown, rather than only the first 10-15 tokens or so.

📝 Additional Information

No response

nextchat-manager[bot] commented 2 days ago

Please follow the issue template to update the description of your issue.

H0llyW00dzZ commented 1 day ago

> (quotes the original issue report above verbatim)

I am pretty sure this is a network issue caused by instability when connecting directly to the endpoint. This is unlike going through an external endpoint mechanism, which has been very stable in my testing (via One API hosted on my site, https://oneapi.b0zal.io/).
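One way to separate endpoint instability from a client-side streaming bug is to stream the same deployment directly and compare what arrives with what NextChat renders. A minimal sketch (run as an ES module on Node 18+; the resource name, deployment name, and API version are placeholder assumptions, not values from this thread):

```ts
// Diagnostic sketch -- NOT NextChat code. Streams a chat completion straight
// from Azure OpenAI and accumulates the deltas, so the full output can be
// compared against what the client displays.
const resource = "my-resource";     // hypothetical Azure resource name
const deployment = "my-deployment"; // hypothetical deployment name
const apiVersion = "2024-02-01";    // assumed API version

const res = await fetch(
  `https://${resource}.openai.azure.com/openai/deployments/${deployment}` +
    `/chat/completions?api-version=${apiVersion}`,
  {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "api-key": process.env.AZURE_API_KEY ?? "",
    },
    body: JSON.stringify({
      messages: [{ role: "user", content: "Write three short paragraphs." }],
      max_tokens: 4000,
      stream: true,
    }),
  },
);

// Read the SSE stream; each "data:" line carries one JSON chunk.
// (Quick-check simplification: assumes SSE lines aren't split across chunks.)
const decoder = new TextDecoder();
let text = "";
for await (const chunk of res.body!) {
  for (const line of decoder.decode(chunk, { stream: true }).split("\n")) {
    if (!line.startsWith("data:") || line.includes("[DONE]")) continue;
    const delta = JSON.parse(line.slice(5)).choices?.[0]?.delta?.content;
    if (delta) text += delta;
  }
}
console.log(`received ${text.length} chars:\n${text}`);
```

If the full text arrives here but the client still cuts it off after 10-15 tokens, the truncation is happening in the client's stream handling rather than on the network.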