mohsiniqbal368 closed this issue 9 months ago
Hi, it is possible to implement this, but you would need to rewrite part of the application.
Amazon Bedrock can stream the results back to the Lambda function instead of waiting for the full response: https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModelWithResponseStream.html
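As a rough sketch of what consuming that stream looks like: in a real Lambda you would call `invoke_model_with_response_stream` on a boto3 `bedrock-runtime` client and iterate over `response["body"]`. The snippet below only shows the event-parsing part against a stand-in list of events, so it runs without AWS credentials; the `outputText` key in the chunk payload matches Amazon Titan models and is an assumption — other model families use different JSON layouts.

```python
import json

def stream_text(events):
    """Yield text fragments from a Bedrock InvokeModelWithResponseStream
    event stream as they arrive, instead of waiting for the full response.
    The JSON layout inside chunk["bytes"] is model-specific; "outputText"
    (Amazon Titan) is assumed here."""
    for event in events:
        chunk = event.get("chunk")
        if chunk:
            payload = json.loads(chunk["bytes"])
            text = payload.get("outputText", "")
            if text:
                yield text  # forward each fragment immediately, e.g. over a WebSocket

# Stand-in for response["body"] from invoke_model_with_response_stream:
fake_events = [
    {"chunk": {"bytes": json.dumps({"outputText": t}).encode()}}
    for t in ["Hello", ", ", "world"]
]
print("".join(stream_text(fake_events)))  # prints "Hello, world"
```

Each yielded fragment is what you would push to the client as soon as it arrives, which is what produces the ChatGPT-style incremental display.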
You could use WebSockets to stream the result back to the user, for example using AWS AppSync: https://serverlessland.com/patterns/appsync-bedrock-subscriptions-cdk
This blog post covers it in more detail: https://aws.amazon.com/blogs/mobile/connecting-applications-to-generative-ai-presents-new-challenges/
Closing this for now, please reopen if anything is unclear.
Is it possible to make it output text like ChatGPT does, starting to print the answer as soon as inference begins constructing it? Right now it keeps waiting for an extended time and then prints the entire output at once. ChatGPT starts outputting the answer immediately and keeps appending words while the user is reading. I'm not sure whether the model produces each output token individually or generates the entire output in one processing step and returns the whole text.