k4l1sh / alexa-gpt

A tutorial on how to use ChatGPT in Alexa
MIT License
177 stars, 38 forks

Alexa's 10 second timeout for skill response workaround #10

Open ghzeni opened 7 months ago

ghzeni commented 7 months ago

Hello! I've been using this skill for quite a lot of things recently, but the 10-second timeout really is a buzzkill for me.

For those who are not aware, this is what I'm talking about.

Since this limit applies to the skill's response to the user, not to a request made inside the skill, I was wondering whether it would be possible to run the request in a separate thread. If the request takes more than 7 seconds, Alexa could give an initial response such as "One moment, please." and provide the actual answer afterwards.

Any feedback on this? Thanks!
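A minimal sketch of the timeout half of this idea, assuming the slow API call is wrapped in a callable (`ask_model` is a hypothetical stand-in, not a function from this repo). It only detects that the time budget ran out and returns the filler phrase. How the real answer would reach the user afterwards is not covered, and on AWS Lambda background work may not keep running once the handler has returned.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

# One shared worker pool; a call that times out keeps running in the background.
_executor = ThreadPoolExecutor(max_workers=2)


def answer_or_fallback(ask_model, question, budget_seconds=7.0,
                       fallback="One moment, please."):
    """Return (text, finished); finished is False if the budget ran out.

    ask_model is a hypothetical callable standing in for the skill's API request.
    """
    future = _executor.submit(ask_model, question)
    try:
        # Wait for the worker thread, but never longer than the budget.
        return future.result(timeout=budget_seconds), True
    except FutureTimeout:
        return fallback, False
```

The 7-second default mirrors the figure mentioned above and leaves a small margin before Alexa's own limit.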

xinyonghu2015 commented 7 months ago

Same question. During testing, I frequently encounter a timeout error with the message: "There was a problem with the requested skill's response."

I believe this issue arises when the response from the LaunchRequestHandler exceeds the 8-second limit imposed by the Alexa service. In our current implementation, the handler calls the GPT-3 API, processes the data, and generates a response. This process sometimes takes longer than the allotted time frame, particularly when the prompt content is complex, resulting in a timeout error.

However, I've observed that when the prompt content is relatively simple, the GPT-3 API response time is usually within the 8-second limit and the skill functions as expected.

xinyonghu2015 commented 7 months ago

At present, after switching the API model from gpt-3.5-turbo-0613 to gpt-3.5-turbo-1106, responses are noticeably faster. The response time is now under 8 seconds, which works around the problem for the moment, but it is not solved in the code.

k4l1sh commented 7 months ago

The gpt-3.5-turbo-1106 model is very fast and helps avoid this issue. Ideally, we would like the Alexa skill to speak each part of the text as it's being generated, similar to what's shown in this streaming example. But Alexa skills can't do this: they can't say each word one by one as separate responses.

A simple solution is to use something called Progressive Response. This keeps the user listening while the skill gets the whole text ready. You can learn more about how to do this from the Amazon Alexa guide on Progressive Responses

"Your skill can send progressive responses to keep the user engaged while your skill prepares a full response to the user's request. A progressive response is interstitial SSML content (including text-to-speech and short audio) that Alexa plays while waiting for your full skill response."

ghzeni commented 7 months ago

A simple solution is to use something called Progressive Response. This keeps the user listening while the skill gets the whole text ready. You can learn more about how to do this from the Amazon Alexa guide on Progressive Responses

This is precisely what I was looking for!! I think this would be a great addition to the project. I'm gonna try to implement this in the next few days.
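For anyone attempting this, here is a rough sketch of the HTTP call behind a progressive response, using only the standard library. The field names (`apiEndpoint`, `apiAccessToken`, `requestId`), the `/v1/directives` path, and the `VoicePlayer.Speak` directive type follow my understanding of the Amazon guide and should be checked against it. `build_progressive_request` is a hypothetical helper, not part of this repo.

```python
import json
import urllib.request


def build_progressive_request(envelope, speech):
    """Build (but do not send) the POST for a progressive response.

    envelope is the JSON request Alexa sent to the skill, parsed into a dict.
    """
    system = envelope["context"]["System"]
    body = {
        # Ties the interstitial speech to the request currently being handled.
        "header": {"requestId": envelope["request"]["requestId"]},
        "directive": {"type": "VoicePlayer.Speak", "speech": speech},
    }
    return urllib.request.Request(
        url=system["apiEndpoint"] + "/v1/directives",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + system["apiAccessToken"],
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The request would be sent with `urllib.request.urlopen(...)` at the start of the handler, before the slow API call begins.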

xinyonghu2015 commented 7 months ago

A simple solution is to use something called Progressive Response. This keeps the user listening while the skill gets the whole text ready. You can learn more about how to do this from the Amazon Alexa guide on Progressive Responses

This is precisely what I was looking for!! I think this would be a great addition to the project. I'm gonna try to implement this in the next few days.

Hi Bro, if you modified the code, can you share it?

ghzeni commented 7 months ago

Hi Bro, if you modified the code, can you share it?

Hey! I still haven't gotten around to it, but as soon as I modify it I'll share it :)

taueres commented 4 months ago

I solved this issue by changing this line:

messages.append({"role": "user", "content": new_question + ". Write max 50 words in the response."})

This limits the response size to around 50 words which really improves the response time.

k4l1sh commented 4 months ago

Limiting answers to 50 words in the prompt really does avoid long wait times; it is a good and simple temporary solution. I put this phrase in the system content instead of the user content to avoid repeating it in every message. Changes are in commit 65945ae.
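Roughly, moving the phrase into the system content could look like the sketch below. This is an illustration only, not the code from the commit: `build_messages`, the default `system_prompt`, and the shape of `history` (a list of role/content dicts) are assumptions.

```python
WORD_LIMIT_HINT = "Write max 50 words in the response."


def build_messages(history, new_question,
                   system_prompt="You are a helpful assistant."):
    """Return a chat message list with the length hint stated once, up front."""
    # The hint lives in the system message, so user turns stay unmodified.
    messages = [{"role": "system", "content": system_prompt + " " + WORD_LIMIT_HINT}]
    messages.extend(history)
    messages.append({"role": "user", "content": new_question})
    return messages
```

With this layout the hint appears once per conversation instead of being appended to every user turn.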

badmin-c commented 1 month ago

It seems the Progressive Response feature is not an option to buy us any time while waiting for the response, as the manual clearly states:

"Note: Progressive responses don't change the overall time allowed for a response. When a user invokes a skill, the skill has approximately eight seconds to return a full response. The skill must finish processing any progressive responses as well as the full response within this time." (https://developer.amazon.com/en-US/docs/alexa/custom-skills/send-the-user-a-progressive-response.html)

I wasn't too happy with the workaround, so I fiddled a bit with different models and custom instructions. I found that I am able to use gpt-4o (which really is a lot faster than former GPT-4 models) using

"max_tokens": 1000,

and

messages = [{"role": "system", "content": "Answer all questions concisely and precisely"}]

Sometimes the first answer tends to be really brief, but follow-up questions like "Please be more precise" or "Tell me more details" work just fine, even on complex topics like "What do we know about the universe" or "Explain the connection between our universe and quantum mechanics".

I feel like this is a much smoother approach than the hard 50-word limit.
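Put together, the settings above might produce a request body like the one in this sketch. `build_chat_payload` is a hypothetical helper, and it assumes the skill sends an OpenAI-style chat completions payload with `model`, `messages`, and `max_tokens` fields.

```python
def build_chat_payload(question, history=()):
    """Assemble a chat request body using the gpt-4o settings described above."""
    messages = [{"role": "system",
                 "content": "Answer all questions concisely and precisely"}]
    messages.extend(history)  # earlier turns, so follow-up questions have context
    messages.append({"role": "user", "content": question})
    return {"model": "gpt-4o", "messages": messages, "max_tokens": 1000}
```

Passing the earlier turns as `history` is what lets a follow-up such as "Tell me more details" refer back to the previous answer.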