Closed ggallo closed 1 month ago
@ggallo We're going to deprecate USE_STREAMING because it's hard to maintain the option and keep backward compatibility while adding new features. Thank you for your understanding.
This issue has been labeled "stale" because the reporter has not responded within 1 month (and 14 days have passed since the last comment). It will be closed automatically in 14 days if there is no response.
This issue has been closed because there was no response within 14 days of it being labeled "stale", 14 days of it last being reopened, and 14 days of the last comment.
Describe the bug
If you deploy with WebSocket (WS) streaming disabled, requests to regenerate LLM responses still use WS regardless. This is a hazard for customers who need to deploy without WSS because API Gateway (APIGW) lacks private WS APIs.
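To make the expected behavior concrete, here is a minimal sketch of flag-gated transport selection. Every name in it (`Transport`, `ChatAction`, `selectTransport`, `selectTransportAsReported`) is hypothetical and not taken from the project's source; it only models the behavior described in this report, assuming a single boolean that mirrors the USE_STREAMING deployment option.

```typescript
// Hypothetical names throughout; not taken from the project's source.
type Transport = "http" | "websocket";
type ChatAction = "send" | "regenerate";

// Expected behavior: the transport depends on the deployment flag alone,
// so every action (including regenerate) honors a non-streaming deployment.
function selectTransport(useStreaming: boolean, _action: ChatAction): Transport {
  return useStreaming ? "websocket" : "http";
}

// Behavior described in this report, modeled for contrast:
// regenerate ignores the flag and always opens a WebSocket.
function selectTransportAsReported(useStreaming: boolean, action: ChatAction): Transport {
  if (action === "regenerate") {
    return "websocket";
  }
  return useStreaming ? "websocket" : "http";
}
```

With streaming disabled, `selectTransport(false, "regenerate")` yields `"http"`, whereas the reported behavior corresponds to `selectTransportAsReported(false, "regenerate")` yielding `"websocket"`.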
To Reproduce
Screenshots
Notice the initial POST, followed by multiple regenerate requests that open WSS connections.