generate audio versions of content

cu-mkp / sandbox

The “Sandbox” space makes available a number of resources that utilize and explore the data underlying "Secrets of Craft and Nature in Renaissance France. A Digital Critical Edition and English Translation of BnF Ms. Fr. 640" created by the Making and Knowing Project at Columbia University.

https://cu-mkp.github.io/sandbox/


generate audio versions of content #97

Open tcatapano opened 2 years ago

tcatapano commented 2 years ago

particularly essays, but could also try entries

using Amazon Polly, Google Text to Speech...

see gTTS-cli https://gtts.readthedocs.io/en/latest/cli.html for generating files...

njr2128 commented 2 years ago

And reach out to @caro27 for advice about screen readers and to brainstorm!

njr2128 commented 2 years ago

We currently have two other "audio" projects in the sandbox that we could link together:

https://cu-mkp.github.io/sandbox/docs/fa21_zayas+waters_elliot+mac_final-project-soundscape.html
paternoster recitations for time-keeping as part of https://cu-mkp.github.io/sandbox/docs/burnsalve.html

And some info from PHS: The Renaissance Society of America (@RSAorg) tweeted at 6:30 AM on Tue, May 17, 2022: Take a look at this collection of sites that use acoustic and visual modeling tools to recreate events in the life of John Donne: https://t.co/TL1QtSunlT @NCState @NEHgov #RenTwitter https://t.co/DSfnZKnDuq (https://twitter.com/RSAorg/status/1526510373553131520?t=JFbz0NpweOp5M9DAO0LIdg&s=03)

caro27 commented 2 years ago

Hi, sorry for my delay. Let me know what you need!

tcatapano commented 1 year ago

Created some audio files for medicine entries using python/gTTS. See: https://github.com/cu-mkp/sandbox/tree/issue97/audio
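For anyone wanting to reproduce this, a minimal sketch of batch generation with the gTTS Python library might look like the following. It is not necessarily how the files in the branch were made. The directory layout, the file names, and the helper names `mp3_path_for` and `synthesize_directory` are assumptions for illustration only. gTTS calls the Google Translate text-to-speech endpoint, so it needs network access.

```python
from pathlib import Path


def mp3_path_for(text_path: Path, out_dir: Path) -> Path:
    """Map an input text file to an .mp3 path with the same base name."""
    return out_dir / (text_path.stem + ".mp3")


def synthesize_directory(text_dir: Path, out_dir: Path, lang: str = "en") -> list:
    """Write one MP3 per non-empty .txt file in text_dir (hypothetical layout)."""
    # Imported here so the path helper above works without gTTS installed.
    from gtts import gTTS  # third-party: pip install gTTS

    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for text_path in sorted(text_dir.glob("*.txt")):
        text = text_path.read_text(encoding="utf-8").strip()
        if not text:
            continue  # skip empty files instead of sending an empty request
        target = mp3_path_for(text_path, out_dir)
        gTTS(text=text, lang=lang).save(str(target))
        written.append(target)
    return written
```

Calling `synthesize_directory(Path("entries"), Path("audio"))` would then write one MP3 per text file, where `entries` and `audio` are placeholder directory names.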

tcatapano commented 1 year ago

Can also try Amazon Polly (https://docs.aws.amazon.com/polly/index.html), maybe using SSML (https://docs.aws.amazon.com/polly/latest/dg/ssml-synthesize-speech-cli.html) for more refined output.
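A rough sketch of what the SSML route could look like from Python, assuming boto3 and configured AWS credentials. The helper names, the `Joanna` voice, and the 600 ms pause are illustrative choices, not anything decided in this issue. The SSML builder is plain standard library; only `synthesize_with_polly` touches AWS.

```python
from xml.sax.saxutils import escape


def paragraphs_to_ssml(paragraphs, pause_ms=600):
    """Wrap paragraphs in SSML <p> tags with an explicit pause after each one."""
    parts = ["<speak>"]
    for paragraph in paragraphs:
        # Escape &, < and > so the source text cannot break the SSML markup.
        parts.append(f"<p>{escape(paragraph)}</p>")
        parts.append(f'<break time="{pause_ms}ms"/>')
    parts.append("</speak>")
    return "".join(parts)


def synthesize_with_polly(ssml, out_path, voice_id="Joanna"):
    """Send SSML to Amazon Polly and save the returned MP3 (needs AWS credentials)."""
    import boto3  # third-party: pip install boto3

    polly = boto3.client("polly")
    response = polly.synthesize_speech(
        Text=ssml,
        TextType="ssml",
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    with open(out_path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())
```

Polly limits how much text a single request can carry, so a full essay would probably need to be split into several requests and the resulting audio joined afterwards.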

tcatapano commented 1 year ago

Used the gTTS CLI to generate audio for the Hagadorn essay. Converted the HTML to text using pandoc, and the text was used as input to gTTS. "Pretty print" line breaks and whitespace cause silences in the generated audio. The Compressor and Truncate Silence effects in Audacity help smooth out some of the choppiness, but it's probably best to normalize the whitespace before passing the text to gTTS.
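One possible way to do that normalization, as a small standard-library sketch: collapse every run of whitespace inside a paragraph to a single space and keep one line per paragraph. The function name `normalize_whitespace` is made up here, and treating blank lines as paragraph breaks is an assumption about the pandoc plain-text output. Whether this removes all of the unwanted silences has not been checked against the generated audio.

```python
import re
import sys


def normalize_whitespace(text: str) -> str:
    """Collapse pretty-print whitespace so each paragraph becomes one line."""
    # Treat one or more blank lines as a paragraph break.
    paragraphs = re.split(r"\n\s*\n", text)
    # Inside a paragraph, replace every run of spaces, tabs, or newlines
    # with a single space.
    cleaned = [" ".join(paragraph.split()) for paragraph in paragraphs]
    # Drop paragraphs that contained only whitespace.
    return "\n".join(paragraph for paragraph in cleaned if paragraph)


if __name__ == "__main__":
    # Filter mode: read text on stdin, write normalized text to stdout.
    sys.stdout.write(normalize_whitespace(sys.stdin.read()) + "\n")
```

Saved as a script, this can sit between the pandoc conversion and the gTTS step, with the normalized text written to a file that is then used as the gTTS input.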