daisy / kb

DAISY Knowledge Bases
https://kb.daisy.org

TTS implementation(s) of DPUB-ARIA #63

Open dginev opened 1 year ago

dginev commented 1 year ago

Hi @mattgarrish! I was reading through #55 and was wondering if I may ask a clarifying question about the TTS remarks there?

Is there an existing reference implementation for text-to-speech that takes DPUB-ARIA markup as input? Or, maybe even better, an actively developed open project that aims to be a reference implementation? I am currently looking for a conforming TTS engine that I can use for testing the outputs of our generator for HTML/EPUB-formatted STEM documents.

My context: I am investigating the existing markup strategies for STEM accessibility and naturally encountered the DPUB-ARIA spec pages. In particular, I am interested in improving the accessibility of ar5iv, which I maintain, and of our generator tool, latexml.

If I can anchor on a reliable TTS engine that has significant support for DPUB-ARIA, that could make life much easier for us. Would you be able to offer a broad overview of implementations as of mid-2023?

Thank you!

mattgarrish commented 1 year ago

Yes, there is better adoption of ARIA roles, and specifically the DPUB-ARIA extensions, than of general TTS technologies like SSML and PLS. I wouldn't put ARIA roles in the same category, as they're not designed to improve the TTS ability of assistive technologies, rather to help announce various structures (among other uses).

I'm not sure if there have been any recent changes, but where things last stood I believe only VoiceOver wasn't announcing the DPUB-ARIA role names, instead exposing their superclass names (e.g., "landmark" instead of "bibliography"). @clapierre has been working on implementation consistency, so he might be able to provide a more detailed update.
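
To illustrate the superclass fallback mentioned above, here is a minimal sketch, not taken from any screen reader's code. The role-to-superclass pairs are a small subset as I understand the DPUB-ARIA spec, and `announced_role` and its `supports_dpub` flag are hypothetical names made up for this example.

```python
# Illustrative subset of DPUB-ARIA roles and their superclass roles
# (an assumption based on my reading of the spec, not an exhaustive list).
SUPERCLASS_ROLE = {
    "doc-bibliography": "landmark",
    "doc-chapter": "landmark",
    "doc-toc": "navigation",
    "doc-noteref": "link",
}


def announced_role(role, supports_dpub):
    """Hypothetical helper: the role name an assistive technology might announce."""
    if not role.startswith("doc-"):
        # Core ARIA roles are announced under their own name.
        return role
    if supports_dpub:
        # The specific DPUB-ARIA name is exposed, without the "doc-" prefix.
        return role[len("doc-"):]
    # Otherwise fall back to the superclass role, if we know it.
    return SUPERCLASS_ROLE.get(role, role)
```

With this sketch, `announced_role("doc-bibliography", False)` gives `"landmark"`, while passing `True` gives `"bibliography"`.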

dginev commented 1 year ago

Thanks again @mattgarrish. To be as clear as possible, I am searching for a list of project names that aspire to (or already have) DPUB-ARIA support, so that I can evaluate which tools would be a good fit for us.

For the purpose of such a question, feel free to assume that I don't know the name of a single project in this space and I am looking to learn about all of them. So far I have learned of one - VoiceOver, although you seem to be using it as a negative example.