mapbox / mapbox-navigation-ios

Turn-by-turn navigation logic and UI in Swift on iOS
https://docs.mapbox.com/ios/navigation/

SystemSpeechSynthesizer should use SSML on iOS 16 #4057

Open 1ec5 opened 2 years ago

1ec5 commented 2 years ago

When falling back to the VoiceOver text-to-speech engine, we currently pass `AVSpeechSynthesizer` an `AVSpeechUtterance` created from the plain-text representation of the spoken instruction, in some cases marked up with the IPA pronunciation of the road name. But the spoken instruction’s SSML representation contains more information about prosody, volume, embedded foreign languages, and how to interpret abbreviations and numbers, all of which we’re dropping on the floor.
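
For context, the fallback path today amounts to something like the following (a minimal sketch with illustrative strings and a hypothetical transcription, not the SDK’s actual code):

```swift
import AVFoundation
import UIKit

// Minimal sketch of the current fallback: speak the plain-text instruction,
// with the road name’s pronunciation attached via the IPA speech attribute.
let text = NSMutableAttributedString(string: "Turn left onto Quay Street")
let roadName = (text.string as NSString).range(of: "Quay Street")
text.addAttribute(.accessibilitySpeechIPANotation, value: "ˈki stɹit", range: roadName)

let utterance = AVSpeechUtterance(attributedString: text)
utterance.voice = AVSpeechSynthesisVoice(language: "en-US")

let synthesizer = AVSpeechSynthesizer() // must be retained for the duration of speech
synthesizer.speak(utterance)
```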

iOS 16 introduces `AVSpeechUtterance(ssmlRepresentation:)` for creating an utterance directly from SSML source code. On iOS 16 and above, we would short-circuit the code that marks up the attributed string with IPA in favor of this initializer. I’m unsure how `AVSpeechUtterance` handles the proprietary Amazon Polly SSML attributes and tags that appear in the Mapbox Voice API output; we may need to strip them out using either a regular expression (fragile) or `XMLDocument` (slow). While we’re at it, we could check whether Apple has fixed some of the caveats we identified in #624: that the Alex voice doesn’t support attributed text and that the other voices don’t support the IPA symbols ɡ and ɹ.

https://github.com/mapbox/mapbox-navigation-ios/blob/4ad5f99b96f962709020ae5ce0d408fe30c6e533/Sources/MapboxNavigation/SystemSpeechSynthesizer.swift#L86
https://github.com/mapbox/mapbox-navigation-ios/blob/9e62f088e12da8a0cc2ab418b6bdf1e716e056be/Sources/MapboxNavigation/RouteVoiceController.swift#L8-L47
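
A hedged sketch of what the short-circuit could look like, assuming `SpokenInstruction`’s `ssmlText`/`text` properties and no stripping of Polly-specific markup yet:

```swift
import AVFoundation
import MapboxDirections

// Hypothetical sketch of the proposed short-circuit: on iOS 16+, build the
// utterance straight from the instruction’s SSML; otherwise fall back to the
// existing plain-text/IPA path.
func utterance(for instruction: SpokenInstruction) -> AVSpeechUtterance {
    if #available(iOS 16.0, *),
       let ssmlUtterance = AVSpeechUtterance(ssmlRepresentation: instruction.ssmlText) {
        // The initializer is failable; invalid SSML (for example, Amazon Polly
        // extensions we failed to strip) returns nil, so we still need the fallback.
        return ssmlUtterance
    }
    return AVSpeechUtterance(string: instruction.text)
}
```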

lahariganti commented 2 years ago

iOS 16 also introduced a bunch of new voices. A useful addition here would be the ability for the client to pass in a voice as well (unless I’m missing an existing way to do so). It looks like updating the protocol method `func speak(_ instruction: SpokenInstruction, during legProgress: RouteLegProgress, locale: Locale?)` to accept a voice identifier would get the job done, as sketched below.
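
A hypothetical sketch of that change (the extra `voiceIdentifier` parameter is my own addition, not an existing SDK API):

```swift
import AVFoundation
import MapboxCoreNavigation
import MapboxDirections

// Hypothetical: thread an optional voice identifier through the speak method
// and apply it to the utterance before handing it to AVSpeechSynthesizer.
func speak(_ instruction: SpokenInstruction,
           during legProgress: RouteLegProgress,
           locale: Locale?,
           voiceIdentifier: String?) {
    let utterance = AVSpeechUtterance(string: instruction.text)
    if let identifier = voiceIdentifier,
       let voice = AVSpeechSynthesisVoice(identifier: identifier) {
        utterance.voice = voice
    } else if let locale = locale {
        utterance.voice = AVSpeechSynthesisVoice(language: locale.identifier)
    }
    // ... pass the utterance to the synthesizer as before.
}
```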

azarovalex commented 2 years ago

@lahariganti I was looking into the AVSpeech changes, and it looks like the default voice we used for offline instructions, `AVSpeechSynthesisVoiceIdentifierAlex`, is no longer shipped with iOS 16. It might be a bug in iOS itself.

There are other voices available, but all of them are “compact”, which means they sound really robotic and you wouldn’t want to use them anyway. You probably don’t want to push end users to download higher-quality voices somewhere deep in iOS Settings either. You can verify this with the diagnostic sketch below.
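
A quick way to check what a given device actually has installed (a diagnostic sketch, not SDK code):

```swift
import AVFoundation

// List the installed English voices and their quality, and check whether Alex
// is present. On a stock iOS 16 device, expect mostly default-quality
// ("compact") voices, and Alex may be missing entirely.
for voice in AVSpeechSynthesisVoice.speechVoices() where voice.language.hasPrefix("en") {
    // quality raw values: 1 = .default ("compact"), 2 = .enhanced, 3 = .premium (iOS 16+)
    print(voice.identifier, voice.name, voice.quality.rawValue)
}
let alex = AVSpeechSynthesisVoice(identifier: AVSpeechSynthesisVoiceIdentifierAlex)
print("Alex available:", alex != nil)
```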

So for now we use a high-quality voice only when we have a network connection, via Mapbox Speech.

To answer your question, the Navigation SDK has an option to delegate voice synthesis to the app itself; see the tutorial here: https://docs.mapbox.com/ios/navigation/examples/custom-voice-controller/