It actually depends on when your skill sends the enqueue response.
While the stream is playing, the skill session has already ended. This is why you cannot send text-to-speech (TTS) as a reply to events like PlaybackNearlyFinished or PlaybackStopped.
But when the customer re-engages your skill using the invocation name, a new session is started and you can send back both TTS and an AudioPlayer directive.
To address your question: I would suggest enqueueing a very short audio file that speaks the next track's title and artist. These files can be generated in real time with AWS Polly, for example: https://aws.amazon.com/polly/
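As a minimal sketch of that idea (assuming the AWS SDK for Node.js and a hypothetical S3 bucket to host the generated MP3, since the AudioPlayer needs an HTTPS stream URL; `synthesizeAnnouncementMp3` and the bucket name are my own placeholders, not from this thread):

```js
const AWS = require('aws-sdk');
const polly = new AWS.Polly();
const s3 = new AWS.S3();
const BUCKET = 'my-skill-audio-bucket'; // hypothetical bucket, served over HTTPS

// Hypothetical helper: synthesize "Next up: <title>, by <artist>" as an MP3 with
// Polly, upload it to S3, and resolve with an HTTPS URL the AudioPlayer can stream.
function synthesizeAnnouncementMp3(title, artist) {
  return polly.synthesizeSpeech({
    OutputFormat: 'mp3',
    Text: 'Next up: ' + title + ', by ' + artist,
    VoiceId: 'Joanna'
  }).promise().then(function (speech) {
    var key = 'announcements/' + Date.now() + '.mp3';
    return s3.upload({
      Bucket: BUCKET,
      Key: key,
      Body: speech.AudioStream,
      ContentType: 'audio/mpeg'
    }).promise().then(function () {
      return 'https://' + BUCKET + '.s3.amazonaws.com/' + key;
    });
  });
}
```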
On PlaybackNearlyFinished for a song, the skill would generate and enqueue an MP3 announcing the next track. On PlaybackNearlyFinished for that announcement MP3, your skill would enqueue the next track itself.
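A minimal sketch of that alternation, in the ASK SDK v1 ('alexa-sdk') handler style used later in this thread; the token prefixes and `getNextTrack()` lookup are assumptions for illustration, and `synthesizeAnnouncementMp3()` is the hypothetical Polly helper sketched above:

```js
'AudioPlayer.PlaybackNearlyFinished': function () {
  const currentToken = this.event.request.token;

  if (currentToken.startsWith('announce:')) {
    // The announcement clip is about to finish: enqueue the track it announced.
    const track = getNextTrack(currentToken); // hypothetical playlist lookup
    this.response.audioPlayerPlay('ENQUEUE', track.url, 'track:' + track.id, currentToken, 0);
    this.emit(':responseReady');
  } else {
    // A song is about to finish: generate and enqueue the short announcement MP3.
    const next = getNextTrack(currentToken); // hypothetical playlist lookup
    synthesizeAnnouncementMp3(next.title, next.artist).then((announceUrl) => {
      this.response.audioPlayerPlay('ENQUEUE', announceUrl, 'announce:' + next.id, currentToken, 0);
      this.emit(':responseReady');
    });
  }
},
```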
Looks like you can't prefix an enqueued bit of audio with a response.speak, whereas you can with the initial start of the AudioPlayer: https://developer.amazon.com/docs/custom-skills/audioplayer-interface-reference.html
In other words,
```js
this.response.speak(speechOutput).audioPlayerPlay(behavior, url, token, expectedPreviousToken, offsetInMilliseconds);
```
works for the first track, but triggers an error with any enqueued track when it comes time to play. Can anyone think of a way of breaking the playlist into issuances of single-track commands, rather than a playlist, so I can prefix each track with its title? Right now, it only speaks the title of the first track, because of that limitation.
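For context, here is roughly what I mean by the two cases, in the same ASK SDK v1 style (handler and variable names are just placeholders):

```js
// Intent handler with a live session: speech plus the initial Play directive works.
'PlayStreamIntent': function () {
  this.response
    .speak('Playing ' + firstTrack.title) // firstTrack is a placeholder
    .audioPlayerPlay('REPLACE_ALL', firstTrack.url, firstTrack.token, undefined, 0);
  this.emit(':responseReady');
},

// AudioPlayer event handler (no open session): only the directive is allowed here,
// so chaining .speak() onto this response is what leads to the error at play time.
'AudioPlayer.PlaybackNearlyFinished': function () {
  this.response
    .audioPlayerPlay('ENQUEUE', nextTrack.url, nextTrack.token, this.event.request.token, 0);
  this.emit(':responseReady');
},
```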