Extract text from a document (textract) and convert it into natural-sounding synthesised speech (Cloud Text-to-Speech), which can use DeepMind's WaveNet models.
Available source formats (from textract)
GCP
Host Machine
/doc2audiobook/data/input: directory to hold all input files.
/doc2audiobook/data/output: directory to store all output files.
/doc2audiobook/.secrets/client_secret.json: GCP authentication token.
git clone git@github.com:danthelion/doc2audiobook.git
cd doc2audiobook
docker build -t doc2audiobook .
Before running, put your documents in the input directory of the host folder that is mapped to /data (in the commands below, /doc2audiobook/data/input).
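A quick pre-flight check can catch a missing directory or credentials file before the container starts. The following is a minimal sketch, not part of the project: the function name check_layout is hypothetical, and it assumes the host layout described above, with /doc2audiobook as the default base path.

```shell
# Hypothetical helper (not shipped with doc2audiobook): verify the host layout.
# Usage: check_layout [BASE]   (BASE defaults to /doc2audiobook, an assumption)
check_layout() {
    base="${1:-/doc2audiobook}"
    rc=0
    # Both data directories must exist on the host
    [ -d "$base/data/input" ] || { echo "missing directory: $base/data/input"; rc=1; }
    [ -d "$base/data/output" ] || { echo "missing directory: $base/data/output"; rc=1; }
    # The GCP credentials file is mounted read-only into the container
    [ -f "$base/.secrets/client_secret.json" ] || { echo "missing file: $base/.secrets/client_secret.json"; rc=1; }
    # An empty input directory means there is nothing to convert
    if [ -d "$base/data/input" ] && [ -z "$(ls -A "$base/data/input")" ]; then
        echo "no documents in $base/data/input"
        rc=1
    fi
    return "$rc"
}
```

Calling check_layout with no arguments checks /doc2audiobook; it prints one line per problem and returns a non-zero status if anything is missing.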
List available voices
docker run \
-v /doc2audiobook/data:/data:rw \
-v /doc2audiobook/.secrets/client_secret.json:/.secrets/client_secret.json:ro \
doc2audiobook -list-voices
Convert all documents in the mapped input folder to audiobooks using the en-GB-Standard-C voice.
docker run \
-v /doc2audiobook/data:/data:rw \
-v /doc2audiobook/.secrets/client_secret.json:/.secrets/client_secret.json:ro \
doc2audiobook --voice en-GB-Standard-C
Convert a single document in the mapped input folder to an audiobook using the en-GB-Standard-C voice.
docker run \
-v /doc2audiobook/data:/data:rw \
-v /doc2audiobook/.secrets/client_secret.json:/.secrets/client_secret.json:ro \
doc2audiobook --voice en-GB-Standard-C --input test_input.txt
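To convert several documents one at a time, the single-document command above can be generated once per file. The sketch below only prints the commands so they can be reviewed first; it reuses just the mounts and flags shown above. The function names and the HOST_DIR variable are assumptions made for this example, and it assumes file names without spaces or shell-special characters.

```shell
# Hypothetical helpers (not part of doc2audiobook). HOST_DIR is an assumed
# variable naming the host base path; it defaults to /doc2audiobook.

# Print the docker command for one file in the mapped input folder.
doc2audiobook_cmd() {
    voice="$1"
    input="$2"
    host_dir="${HOST_DIR:-/doc2audiobook}"
    printf 'docker run -v %s/data:/data:rw -v %s/.secrets/client_secret.json:/.secrets/client_secret.json:ro doc2audiobook --voice %s --input %s\n' \
        "$host_dir" "$host_dir" "$voice" "$input"
}

# Print one command per regular file found in the host input directory.
doc2audiobook_cmds_for_dir() {
    voice="$1"
    host_dir="${HOST_DIR:-/doc2audiobook}"
    for f in "$host_dir"/data/input/*; do
        [ -f "$f" ] || continue
        doc2audiobook_cmd "$voice" "$(basename "$f")"
    done
}
```

For example, doc2audiobook_cmds_for_dir en-GB-Standard-C lists the commands; once they look right, the output can be piped to sh to run them.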