I am getting close to being able to work on this in earnest. I am seeking direction from the community.
Background
I am not visually impaired. In high school I had a good friend who is totally blind. We spent lots of time together, and I learned Grade II Braille. My wife has low vision and requires high contrast and large text. She uses TTS (iPhone, iMac, Plex, Roku, etc.). I helped add accessibility to a Java product, so I am aware of a number of the issues and accepted practices.
I started off almost completely ignorant of the TTS abilities of Linux and Windows. I knew a little as a user, but nothing at the programming level. Specifically, I did not know much about speechd.
I have torn up service.kodi.tts in my refactoring, so only a couple of the voices work at the moment. It is buggy, but it works well enough for my needs.
The three major areas that need work are:
1- Voicing/Voicing engine. The actual voicing of text needs to work well and sound better than old 'computer' voicing.
2- Integration with Kodi windows should be much better. Lots of windows are voiced poorly or not at all. This is likely to be a slow process because it will probably require cooperation from window/dialog owners and, most likely, from Kodi itself.
3- Improvements to the TTS UI itself, including settings. I have already made a number of changes in this area, chiefly applying each change to the TTS behavior as it is made. Before, you made a bunch of changes, hit enter and prayed. Now, if you change the pitch or the language, it takes effect immediately, which makes it easier to find the settings that you want. A lot more work needs to be done in this area.
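To illustrate the "changes immediately" behavior described in the last item, here is a minimal sketch of one way it could be structured: a settings object that notifies listeners on every change. It is not the add-on's actual code. `LiveSettings`, `FakeEngine` and the setting names are all hypothetical, made up for this example.

```python
from typing import Callable, Dict, List


class LiveSettings:
    """Hypothetical settings holder that notifies listeners on every change."""

    def __init__(self, **initial: object) -> None:
        self._values: Dict[str, object] = dict(initial)
        self._listeners: List[Callable[[str, object], None]] = []

    def subscribe(self, listener: Callable[[str, object], None]) -> None:
        """Register a callback invoked as listener(key, new_value)."""
        self._listeners.append(listener)

    def get(self, key: str) -> object:
        return self._values[key]

    def set(self, key: str, value: object) -> None:
        """Store the value and tell every listener right away."""
        if self._values.get(key) == value:
            return  # unchanged, so there is nothing to re-apply
        self._values[key] = value
        for listener in self._listeners:
            listener(key, value)


class FakeEngine:
    """Stand-in for a voicing engine; it only records what it was told."""

    def __init__(self) -> None:
        self.applied: Dict[str, object] = {}

    def on_setting_changed(self, key: str, value: object) -> None:
        # A real engine would re-voice a sample phrase here so the user
        # hears the effect of the new pitch or language at once.
        self.applied[key] = value


settings = LiveSettings(pitch=50, language="en")
engine = FakeEngine()
settings.subscribe(engine.on_setting_changed)
settings.set("pitch", 70)  # the engine sees the new pitch immediately
```

The point of the design is that the settings dialog never needs an "apply" step: every edit reaches the engine as it happens.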
Voicing Engines
From my shallow survey of the technologies and options, there are basically three types of voicing engines:
1- Old-technology, widely available, free voicing engines that provide more or less the same capabilities. The voicing engines here include espeak, festival, flite, etc.
2- More sophisticated voicing engines, either built into the OS (Windows) or provided by a commercial library (not free). Engines here include JAWS and what Windows supplies.
3- Remote voicing engines. Some providers offer a limited number of free translations per month. Some have fairly restrictive terms and conditions on the produced voice, while others are less so. Engines here include ResponsiveVoice, Google and Amazon.
I tend to think that less effort should be put into supporting the older technology engines. I will pick two (probably espeak and festival) and ignore the rest for now.
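As a rough sketch of what supporting only two local engines might look like, the snippet below checks which of them are installed by looking for their executables on the PATH. The executable names `espeak` and `festival` are an assumption (some distributions ship `espeak-ng` instead), and the function names are hypothetical.

```python
import shutil
from typing import Callable, List, Optional, Sequence

# Assumed executable names, in order of preference.
PREFERRED_ENGINES = ("espeak", "festival")


def available_engines(
    candidates: Sequence[str] = PREFERRED_ENGINES,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the candidate engines whose executable is found on the PATH."""
    return [name for name in candidates if which(name) is not None]


def pick_engine(
    candidates: Sequence[str] = PREFERRED_ENGINES,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first installed engine, or None if none is installed."""
    found = available_engines(candidates, which)
    return found[0] if found else None
```

The `which` parameter exists only so the lookup can be replaced in tests. In normal use, `pick_engine()` with no arguments is enough.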
I haven't done anything yet for the second category (sophisticated voicing engine libraries), nor for speechd.
At least for Linux, it seems that the third category of voicing engines (remote) is the way to go for now. The biggest issues I see here are:
1- Licensing/Cost
2- API restrictions or quirks
3- The need to cache the translated voicings (otherwise the delays will be maddening)
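The caching in item 3 could be sketched roughly as follows. This is a minimal illustration, not the add-on's implementation: `VoiceCache` is a hypothetical name, the `synthesize` callable stands in for whatever remote engine is used, and the `.audio` file suffix is arbitrary.

```python
import hashlib
from pathlib import Path
from typing import Callable


class VoiceCache:
    """Hypothetical on-disk cache so each phrase is fetched remotely only once."""

    def __init__(self, cache_dir: Path, synthesize: Callable[[str, str], bytes]) -> None:
        self._dir = Path(cache_dir)
        self._dir.mkdir(parents=True, exist_ok=True)
        # synthesize(voice, text) -> audio bytes; stands in for the remote call.
        self._synthesize = synthesize
        self.remote_calls = 0  # how many times the remote engine was used

    def _path_for(self, voice: str, text: str) -> Path:
        # Key on both voice and text, so changing the voice does not
        # return audio recorded with a different one.
        key = hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()
        return self._dir / f"{key}.audio"

    def get_audio(self, voice: str, text: str) -> Path:
        """Return the path of the cached audio, fetching it on a cache miss."""
        path = self._path_for(voice, text)
        if not path.exists():
            self.remote_calls += 1
            path.write_bytes(self._synthesize(voice, text))
        return path
```

Because Kodi voices the same menu labels over and over, most lookups would be cache hits, which avoids the network delay and also keeps the character count sent to the provider low.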
I have spent most of my effort getting ResponsiveVoice to work. ResponsiveVoice works fairly well, at least for English. Each free account can translate up to 1 million characters per month. You may hit the limit in the first month, but before too long this should not be an issue (with caching).
How to Give Feedback
For now, please either send me a note or open an issue. We can use the issues as a forum for discussion (at least for now).
I need to know in what direction(s) I should head. I suspect the highest priority is to band-aid what I have and release it. It has been a very long time since any TTS has worked on Kodi.
I'm visually impaired myself and have Python knowledge, though not with Kodi. So if you could point me in the right direction to get started developing, I'm happy to assist and to give this a try.