Image alt text is not read by TTS

prashantverma2014 commented 6 months ago

For example, when DTBook XML is converted to DAISY 3 and an Azure voice is used, the image alt text is missing from the audio files of the output. If the alt text contains numbers, it is recorded.
To reproduce the issue, the document, its DTBook XML and the DAISY 3 output are attached.

DAISY 3.zip Sinking_of_the_Titanic-model_answer.docx Sinking_of_the_Titanic-model_answer_DtbookXML_202403262109546774.zip

bertfrees commented 5 months ago

This could be achieved using a CSS style sheet.

<imggroup>
  <img id="rId111"
       src="Sinking_of_the_Titanic-model_answer-Picture%2048.png"
       alt="Painting of a ship sinking by the bow, with people rowing a lifeboat in the foreground and other people in the water. Icebergs are visible in the background."/>
</imggroup>

The alt attribute in this example could be spoken by applying the following CSS:

@media aural {
  img[alt]::before {
    content: attr(alt);
  }
}

The CSS could be further improved by e.g. not speaking the alt text when a caption is present.
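
As a rough sketch of that refinement, assuming the caption sits in the same <imggroup> as the image and that the CSS engine supports the general sibling combinator and the `:has()` pseudo-class (neither has been verified here for the TTS step), the rules could look like this:

```css
@media aural {
  /* Default: speak the alt text before the image */
  img[alt]::before {
    content: attr(alt);
  }
  /* Caption precedes the image in the same imggroup: skip the alt text */
  imggroup > caption ~ img[alt]::before {
    content: none;
  }
  /* Caption in any position in the imggroup; assumes :has() is supported */
  imggroup:has(> caption) > img[alt]::before {
    content: none;
  }
}
```

The two suppression rules are kept separate so that an engine without `:has()` support can drop the last rule without affecting the others.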

bertfrees commented 5 months ago

Let me know if you think this behavior makes sense to have as an option.

bertfrees commented 4 months ago

@marisademeglio @NPavie For this issue I'm thinking of doing the following:

- Add a "stylesheet-parameters" option to dtbook-to-daisy3, dtbook-to-epub3, epub3-to-epub3 and epub-to-daisy.
- Support a new <userAgentStylesheet> element inside the /stylesheet-parameters request document. The client is expected to include this element with a mediaType attribute corresponding to the input of the script:
  - <userAgentStylesheet mediaType="application/x-dtbook+xml"/> for dtbook-to-pef, dtbook-to-daisy3 and dtbook-to-epub3
  - <userAgentStylesheet mediaType="application/xhtml+xml"/> for html-to-pef, epub3-to-pef, epub3-to-epub3 and epub-to-daisy

Do you think it would be feasible to adapt the GUI? That is, include the <userAgentStylesheet> element in the request, and turn the job forms of the dtbook-to-daisy3, dtbook-to-epub3, epub3-to-epub3 and epub-to-daisy scripts into a two-step process, just like the braille scripts.

marisademeglio commented 4 months ago

@bertfrees can you open an issue in the UI repo for this?

bertfrees commented 4 months ago

Done: https://github.com/daisy/pipeline-ui/issues/227

bertfrees commented 2 months ago

Fixed by 1a3056bd / 24d622f524

daisy / pipeline-modules

Image alt text is not read by TTS #84