It actually depends on when your skill sends the enqueue response.
While the stream is playing, the skill session has already ended. This is why you cannot send text-to-speech (TTS) as a reply to events like PlaybackNearlyFinished or PlaybackStopped.
But when the customer re-engages your skill using the invocation name, a new session is started and you can send back both TTS and an AudioPlayer directive.
To address your question: I would suggest enqueueing a very short audio file that speaks the next track's title and artist. These files can be generated in real time with AWS Polly, for example: https://aws.amazon.com/polly/
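As a minimal sketch of that idea (assuming the AWS SDK for Node.js and a hypothetical S3 bucket to host the generated MP3, since the AudioPlayer needs an HTTPS stream URL; `synthesizeAnnouncementMp3` and the bucket name are my own placeholders, not from this thread):

```js
const AWS = require('aws-sdk');
const polly = new AWS.Polly();
const s3 = new AWS.S3();
const BUCKET = 'my-skill-audio-bucket'; // hypothetical bucket, served over HTTPS

// Hypothetical helper: synthesize "Next up: <title>, by <artist>" as an MP3 with
// Polly, upload it to S3, and resolve with an HTTPS URL the AudioPlayer can stream.
function synthesizeAnnouncementMp3(title, artist) {
  return polly.synthesizeSpeech({
    OutputFormat: 'mp3',
    Text: 'Next up: ' + title + ', by ' + artist,
    VoiceId: 'Joanna'
  }).promise().then(function (speech) {
    var key = 'announcements/' + Date.now() + '.mp3';
    return s3.upload({
      Bucket: BUCKET,
      Key: key,
      Body: speech.AudioStream,
      ContentType: 'audio/mpeg'
    }).promise().then(function () {
      return 'https://' + BUCKET + '.s3.amazonaws.com/' + key;
    });
  });
}
```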
On PlaybackNearlyFinished for a song, the skill would generate and enqueue an MP3 announcing the next track. On PlaybackNearlyFinished for that announcement MP3, your skill would enqueue the next track itself.
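A minimal sketch of that alternation, in the ASK SDK v1 ('alexa-sdk') handler style used later in this thread; the token prefixes and `getNextTrack()` lookup are assumptions for illustration, and `synthesizeAnnouncementMp3()` is the hypothetical Polly helper sketched above:

```js
'AudioPlayer.PlaybackNearlyFinished': function () {
  const currentToken = this.event.request.token;

  if (currentToken.startsWith('announce:')) {
    // The announcement clip is about to finish: enqueue the track it announced.
    const track = getNextTrack(currentToken); // hypothetical playlist lookup
    this.response.audioPlayerPlay('ENQUEUE', track.url, 'track:' + track.id, currentToken, 0);
    this.emit(':responseReady');
  } else {
    // A song is about to finish: generate and enqueue the short announcement MP3.
    const next = getNextTrack(currentToken); // hypothetical playlist lookup
    synthesizeAnnouncementMp3(next.title, next.artist).then((announceUrl) => {
      this.response.audioPlayerPlay('ENQUEUE', announceUrl, 'announce:' + next.id, currentToken, 0);
      this.emit(':responseReady');
    });
  }
},
```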
Looks like you can't prefix an enqueued bit of audio with a response.speak, whereas you can with the initial start of the AudioPlayer: https://developer.amazon.com/docs/custom-skills/audioplayer-interface-reference.html
In other words,
```js
this.response.speak(speechOutput).audioPlayerPlay(behavior, url, token, expectedPreviousToken, offsetInMilliseconds);
```
works for the first track, but triggers an error with any enqueued track when it comes time to play. Can anyone think of a way of breaking the playlist into issuances of single-track commands, rather than a playlist, so I can prefix each track with its title? Right now, it only speaks the title of the first track, because of that limitation.
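For context, here is roughly what I mean by the two cases, in the same ASK SDK v1 style (handler and variable names are just placeholders):

```js
// Intent handler with a live session: speech plus the initial Play directive works.
'PlayStreamIntent': function () {
  this.response
    .speak('Playing ' + firstTrack.title) // firstTrack is a placeholder
    .audioPlayerPlay('REPLACE_ALL', firstTrack.url, firstTrack.token, undefined, 0);
  this.emit(':responseReady');
},

// AudioPlayer event handler (no open session): only the directive is allowed here,
// so chaining .speak() onto this response is what leads to the error at play time.
'AudioPlayer.PlaybackNearlyFinished': function () {
  this.response
    .audioPlayerPlay('ENQUEUE', nextTrack.url, nextTrack.token, this.event.request.token, 0);
  this.emit(':responseReady');
},
```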