CorentinJ / Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

proposing accessibility changes for the gui #1046

Open king-dahmanus opened 2 years ago

king-dahmanus commented 2 years ago

Hi, please use wxPython instead of Qt, as wxPython has the highest degree of accessibility for screen readers. Thanks. I would submit a pull request myself if I knew enough Python, but I'm still on the basics.

raccoonML commented 2 years ago

The major benefit of the toolbox is the audio visualizations, in the form of speaker embeds and spectrograms. If you don't need images, a very basic interface could suffice. Maybe this one works for you. https://huggingface.co/spaces/akhaliq/Real-Time-Voice-Cloning

king-dahmanus commented 2 years ago

Thanks, but I can't use that one properly; it doesn't seem to do what I wanted it to do. What's more, I want to customize the parameters and export the models, neither of which is doable there.

raccoonML commented 2 years ago

I don't intend to work on this issue, but I suggest that you come up with detailed requirements to help a developer who is interested in solving this problem. What features do you need? Can you explain how a screen reader works, and what features in the UI work well with it? You need to help us understand why it is beneficial to use wxPython.

king-dahmanus commented 2 years ago

OK, here goes nothing! A screen reader doesn't rely on AI to read the screen; it relies on the accessibility information that applications expose, and reads that out. Qt, for example, doesn't seem to get along with screen readers very well, at least not by default, and that's why I suggested wxPython. wxPython is a GUI library that wraps the system's own native widgets (the standard Windows controls on Windows), which already provide all the accessibility information a screen reader needs, so it's the most accessible GUI library in Python. Hope that was useful.
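To make it concrete, here's roughly what I mean, just as an untested sketch (I'm still learning, so the widget names are made up, not actual toolbox widgets): standard wxPython controls with text labels and nothing accessibility-specific bolted on.

```python
import wx

# Rough sketch only: a plain wxPython window built from standard native
# controls, with no accessibility-specific code added on top.
class CloneFrame(wx.Frame):
    def __init__(self):
        super().__init__(None, title="Voice cloning (accessible sketch)")
        panel = wx.Panel(self)
        sizer = wx.BoxSizer(wx.VERTICAL)

        # Buttons carry their own text label, which is exactly what the
        # native accessibility layer hands to the screen reader.
        self.record_btn = wx.Button(panel, label="Record 5 seconds")
        self.synth_btn = wx.Button(panel, label="Synthesize and vocode")
        self.export_btn = wx.Button(panel, label="Export generated wav")

        for btn in (self.record_btn, self.synth_btn, self.export_btn):
            sizer.Add(btn, 0, wx.ALL | wx.EXPAND, 5)

        panel.SetSizer(sizer)

if __name__ == "__main__":
    app = wx.App()
    CloneFrame().Show()
    app.MainLoop()
```

Because these are the system's own controls, the screen reader announces each button by its label without any extra work from the developer.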

raccoonML commented 2 years ago

Does the developer need to do anything special with wxPython to provide that accessibility info to the screen reader? Another way of stating the question is, if a wxPython interface is constructed by someone with absolutely no knowledge of this issue, will you get enough info from the screen reader?

We can assume that the dev is thoughtful enough to use text labels instead of images. Are there any other considerations for making an accessible UI?

king-dahmanus commented 2 years ago

It will be accessible if the buttons, lists, combo boxes, and all the other controls have their text labeled. If a control isn't labeled internally, the screen reader will just say "button", "combo box", and so on, without any information about what it contains or does.
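To show the difference I mean, here's a tiny untested sketch (the field names are just made up for illustration): the first field has no label, so it gets read only as a text box; the second has a label right before it.

```python
import wx

# Contrast between an unlabeled and a labeled control (wxPython 4 assumed).
app = wx.App()
frame = wx.Frame(None, title="Labeling demo")
panel = wx.Panel(frame)
sizer = wx.BoxSizer(wx.VERTICAL)

# Unlabeled: a screen reader announces this only as "edit" / "text box",
# with no hint about what it is for.
bare_field = wx.TextCtrl(panel)
sizer.Add(bare_field, 0, wx.ALL | wx.EXPAND, 5)

# Labeled: a StaticText placed right before the control gives the native
# accessibility layer a name to announce; SetName adds an internal name as
# an extra hint (how much each screen reader uses it may vary).
sizer.Add(wx.StaticText(panel, label="Text to synthesize:"), 0, wx.ALL, 5)
text_field = wx.TextCtrl(panel)
text_field.SetName("Text to synthesize")
sizer.Add(text_field, 0, wx.ALL | wx.EXPAND, 5)

panel.SetSizer(sizer)
frame.Show()
app.MainLoop()
```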

raccoonML commented 2 years ago

Is there a way to do this with PyQt so we don't need to rewrite the interface?
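For instance, Qt widgets expose setAccessibleName() and setAccessibleDescription(), and QLabel.setBuddy() ties a visible label to a control. Would something along these lines be enough? Rough, untested sketch; the widget names below are invented for illustration, not taken from the toolbox.

```python
import sys
from PyQt5.QtWidgets import (QApplication, QComboBox, QLabel, QPushButton,
                             QVBoxLayout, QWidget)

# Per-widget accessibility hints in PyQt5 (sketch only).
app = QApplication(sys.argv)
window = QWidget()
layout = QVBoxLayout(window)

dataset_label = QLabel("&Dataset:")
dataset_box = QComboBox()
dataset_box.addItems(["LibriSpeech", "VCTK"])
dataset_box.setAccessibleName("Dataset")
dataset_label.setBuddy(dataset_box)  # associates the visible label with the combo box

synth_btn = QPushButton("Synthesize and vocode")
synth_btn.setAccessibleDescription(
    "Runs the synthesizer and vocoder on the current embedding")

for widget in (dataset_label, dataset_box, synth_btn):
    layout.addWidget(widget)

window.show()
sys.exit(app.exec_())
```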

king-dahmanus commented 2 years ago

I don't know myself, since I'm still learning the basics of Python, but I'll have to learn both wxPython and PyQt so I can somehow replicate in Qt what wxPython provides.
