jhudsl / ari

:dancers: The Automated R Instructor
https://jhudatascience.org/ari/

SSML for AWS POLLY #29

Closed masanao-yajima closed 3 years ago

masanao-yajima commented 4 years ago

Great package. Are there any plans to support SSML? There might be more to this, but it seems like it would only take adding ",..." at the end of this line in ari_spin, or adding a list parameter to ari_spin that is forwarded to tts.

wav <- text2speech::tts(text = paragraphs[i], voice = voice, service = service, bind_audio = TRUE)

seankross commented 4 years ago

I agree it would be a good idea to support folks who want to use SSML.

rpietro commented 3 years ago

Definitely agree with that, support for SSML would be outstanding

machiavelli-a commented 3 years ago

I agree, would be great to add SSML

rodrigohuerta commented 3 years ago

This could truly improve AWS POLLY.

drmeene commented 3 years ago

yup, SSML would definitely enhance the quality of the videos. Thanks for working on Ari, very useful package

Rexmontine commented 3 years ago

Before I knew of Ari I was automating some of the same processes, but creating the sound files directly in Polly. The tags do make a major difference if you know what you are doing. Now that I started using Ari, I do miss the SSML syntax. Do you guys plan on adding it?

amitagrey commented 3 years ago

SSML support will take ARI usability to another level. Eagerly looking forward to it.

paulsenluiza commented 3 years ago

I agree. SSML support would be great.

lucasxteixeira commented 3 years ago

SSML support would be very useful.

mgkulik commented 3 years ago

Yes please! With SSML support the package will get even better.

lucasnanni commented 3 years ago

I fully agree with that. Support for SSML would be amazing!

keilabcs commented 3 years ago

I fully agree with that. SSML support would be great.

muschellij2 commented 3 years ago

This is not going to be added to ari any time soon. Pull requests are welcome

muschellij2 commented 3 years ago

The package uses text2speech for synthesis, which wraps aws.polly, as per https://github.com/jhudsl/text2speech/blob/f46274ef70950e295ddbeb1e7a34ab2ee0afcfa8/R/tts_backend.R#L79. The ... is passed through to aws.polly::get_synthesis at https://github.com/jhudsl/text2speech/blob/f46274ef70950e295ddbeb1e7a34ab2ee0afcfa8/R/tts_backend.R#L124. Thus, you should be able to use tts_amazon(ssml = TRUE) as per https://github.com/cloudyr/aws.polly/blob/master/R/synthesize.R#L26.

One issue is that text2speech splits the text based on the number of characters, so a split can land in the middle of your SSML markup and break it.
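To illustrate why character-based splitting and SSML do not mix well, here is a minimal sketch in base R. The helper split_by_chars is hypothetical and deliberately naive; it is not text2speech's actual splitting logic, and the chunk size of 40 is chosen only to force a split on a short string.

```r
# Example SSML input (illustrative only).
ssml <- paste0(
  '<speak>Hello <break time="1s"/> and welcome to the ',
  '<emphasis level="strong">course</emphasis>.</speak>'
)

# Hypothetical helper: cut a string into pieces of at most n characters,
# ignoring any markup. This is an assumption for illustration, not the
# algorithm text2speech uses.
split_by_chars <- function(x, n) {
  starts <- seq(1, nchar(x), by = n)
  substring(x, starts, pmin(starts + n - 1, nchar(x)))
}

chunks <- split_by_chars(ssml, 40)

# A chunk is only usable as standalone SSML if it is still wrapped in
# <speak>...</speak>; after a blind split, the pieces no longer are.
is_wrapped <- startsWith(chunks, "<speak>") & endsWith(chunks, "</speak>")

print(chunks)
print(is_wrapped)
```

Each piece would be sent to the service separately, so a piece that opens a tag without closing it (or the reverse) is no longer valid SSML on its own.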

So if you need SSML, I'd recommend synthesizing the audio outside of ari and then passing a list of Wave objects or WAV file names into the audio argument of ari_stitch: https://github.com/jhudsl/ari/blob/master/R/ari_stitch.R#L71
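A rough sketch of that workaround might look like the following. The function stitch_with_ssml is a hypothetical helper, not part of ari. It assumes that aws.polly::synthesize forwards ssml = TRUE to get_synthesis and returns a tuneR Wave object, that AWS credentials are configured, and that ffmpeg is available for ari_stitch. The voice name and file names are placeholders.

```r
# Hypothetical helper (not part of ari): synthesize one SSML string per
# slide with aws.polly, then pass the resulting audio to ari::ari_stitch().
stitch_with_ssml <- function(ssml, images, output, voice = "Joanna") {
  # One SSML document per image, each sent to Polly whole (no splitting).
  stopifnot(length(ssml) == length(images))

  waves <- lapply(ssml, function(x) {
    # Assumption: ssml = TRUE is passed through `...` to get_synthesis().
    aws.polly::synthesize(text = x, voice = voice, ssml = TRUE)
  })

  ari::ari_stitch(images = images, audio = waves, output = output)
}

# Example call (requires AWS credentials, ffmpeg, and existing image files):
# stitch_with_ssml(
#   ssml   = c('<speak>Welcome. <break time="1s"/> Let us begin.</speak>',
#              '<speak>This is the <emphasis>second</emphasis> slide.</speak>'),
#   images = c("slide1.png", "slide2.png"),
#   output = "lecture.mp4"
# )
```

Because each SSML document is synthesized in a single request here, the character-based splitting in text2speech never touches the markup. Very long narration would still need to respect Polly's own input limits.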