daisy / pipeline-modules

Modules for the DAISY Pipeline project

Ability to get a list of languages used in a document and corresponding TTS voices for a given voice configuration and CSS style sheets #96

Open bertfrees opened 5 months ago

bertfrees commented 5 months ago

It would be nice if we could extend the existing /voices API endpoint for this.

bertfrees commented 3 months ago

I have figured out how I will extend the /voices endpoint. Currently, the POST request can only contain an XML config document. In the future, it can also be a multipart request that additionally includes an input document (DTBook or (X)HTML) and a number of CSS/Sass style sheets, just like with the /stylesheet-parameters endpoint. The multipart request consists of up to three parts, all optional: a "voice-config" part (the TTS config XML), a "source-analysis-request" part (an XML document with a syntax identical to the "stylesheet-parameters-request" syntax for the /stylesheet-parameters endpoint), and a "source-data" part (a zip containing the referenced files). When a "source-analysis-request" part is included, the source document and style sheets are used to determine all the voices that will be used to convert the document to speech.
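To make the request layout concrete, here is a minimal sketch of how a client might assemble the three parts, using only the Python standard library. Only the part names ("voice-config", "source-analysis-request", "source-data") come from the description above. The helper names, the content types, the file names in the zip, and the placeholder XML payloads are assumptions for illustration, not the actual request syntax.

```python
import io
import uuid
import zipfile


def build_zip(files):
    """Pack a {name: bytes} mapping into an in-memory zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def build_multipart(parts, boundary=None):
    """Encode (name, content_type, payload) tuples as multipart/form-data.

    Returns the Content-Type header value and the encoded body.
    """
    boundary = boundary or uuid.uuid4().hex
    out = io.BytesIO()
    for name, content_type, payload in parts:
        out.write(f"--{boundary}\r\n".encode())
        out.write(f'Content-Disposition: form-data; name="{name}"\r\n'.encode())
        out.write(f"Content-Type: {content_type}\r\n\r\n".encode())
        out.write(payload)
        out.write(b"\r\n")
    out.write(f"--{boundary}--\r\n".encode())
    return f"multipart/form-data; boundary={boundary}", out.getvalue()


# Placeholder payloads: the real XML syntax is defined by the web service,
# not by this sketch. All three parts are optional.
parts = [
    ("voice-config", "application/xml", b"<config/>"),
    ("source-analysis-request", "application/xml", b"<request/>"),
    ("source-data", "application/zip",
     build_zip({"book.html": b"<html/>", "style.css": b"body{}"})),
]
content_type, body = build_multipart(parts)
```

The resulting `content_type` and `body` could then be sent in a POST to /voices with any HTTP client. Leaving out the "source-analysis-request" part would correspond to today's behaviour, where only the voice configuration is taken into account.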

The syntax of the response is the same as before, but only the voices that will actually be used are included.

With this method there is unfortunately no way to get the list of languages and dialects used in the document. You can only try to infer the languages from the voices used, because a voice will only be used for text in the language it is meant for, unless you name specific voices in your voice-family. Inferring the dialects is trickier, however, because of the "Default for language-x" options.
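As a rough illustration of that inference, a client could collect the language tags of the returned voices. The sketch below assumes the response is an XML document containing `voice` elements with a `lang` attribute. That shape, the sample data, and the function names are assumptions, not something specified in this issue.

```python
import xml.etree.ElementTree as ET


def languages_from_voices(xml_text):
    """Collect the distinct language tags of all voice elements.

    Assumes each voice is a <voice> element with a "lang" attribute;
    any XML namespace on the element is ignored.
    """
    langs = set()
    for el in ET.fromstring(xml_text).iter():
        local_name = el.tag.rsplit("}", 1)[-1]
        if local_name == "voice" and el.get("lang"):
            langs.add(el.get("lang"))
    return sorted(langs)


def primary_languages(tags):
    """Reduce tags such as "en-US" to their primary language subtag."""
    return sorted({tag.split("-")[0].lower() for tag in tags})


# Hypothetical response, for illustration only.
sample = """
<voices>
  <voice name="a" engine="x" lang="en-US"/>
  <voice name="b" engine="x" lang="nl"/>
  <voice name="c" engine="y" lang="en-US"/>
</voices>
"""
tags = languages_from_voices(sample)   # ['en-US', 'nl']
print(primary_languages(tags))         # ['en', 'nl']
```

Only the primary language is reasonably safe to read off this way. The region part of a tag says which dialect the voice was made for, not necessarily which dialect the text is in, since a voice may have been picked as the default for the whole language. The result is also unreliable when voice-family names specific voices.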