Open haruncetinkaya306 opened 6 months ago
For the moment, there are multiple add-ons attempting to do this. Have you tried them?
I know it is not the most satisfying solution.
Also, does JAWS charge extra for this? ChatGPT tends to have a cost.
Actually there are only three in the store.
AI Content Describer; Open AI; XposeImage Captioner.
Some will be incompatible depending on your NVDA version, and you will have to use the override compatibility option to try them, or use a version from the beta or dev channels.
@XLTechie Using this feature in JAWS is free. I don't understand: which of these add-ons can describe photographs?
I have not used any of those, but probably AI Content Describer is the most likely to work. This is a guess.
It's nice to have this as a plugin, but I would request that this feature be added to NVDA's core. @gerald-hartig
@haruncetinkaya306 Maybe they slightly increased the price of JAWS to accommodate this new functionality?
@haruncetinkaya306 - hi, please do not encourage or discuss illegal software usage here, thanks
This is something that should be, and already is, successfully handled in an add-on, for the following reasons:
@Adriani90 If it is true that this feature is paid, of course it should not be added to NVDA's core, because that would go against the reason NVDA was created. However, what I don't understand is how a screen reader that has this feature can offer it for free, because I do not encounter any fee requests when using it. JAWS already had a smart image recognition feature in 2019, but it provided limited information. Now the feature sends images to artificial intelligence services and gets a detailed description from them free of charge.
Hello, I just checked the feature again. When I run JAWS in its 40-minute mode and try the feature, it tells me to activate the license. This may mean that there is a fee.
Just to be clear, if we were to integrate this into NVDA it would have to be a free component. This is not impossible. The most likely options at this stage would be to install open source AI models as an optional component of NVDA, or to use a built-in Windows service like the one used for OCR, if one is created for Copilot. Both of these would avoid making internet requests to analyse the image. Alternatively, we would have to find a free internet-based service, but unless there was strong user trust in such a service handling the data, this would probably be best left as an add-on.
I think it is good to leave this open, since the first alternative you propose sounds attractive even though it is unlikely in the near future. I think using either an integrated Windows service when one becomes available, or building an open source AI model into the NVDA installation, seems the best solution. Other internet-based requests will not be accepted, especially in corporate environments.
Be My Eyes, with its Be My AI feature, is currently working on Windows integrations, so picture description with Be My AI will also be available on desktops and laptops. Having this feature built into the screen reader will therefore not really be necessary in the near future. Moreover, since Be My AI is focused on this area, improvements to picture descriptions can be proposed to the Be My Eyes developers, which I guess is more efficient.
NV Access had similar questions asked of us at CSUN around when we would be bringing AI image recognition into the core. There's a lot to say on the topic, but very briefly our position at the moment is that we're very excited by the potential benefits that this technology can bring and the add-on system is a marvellous way of exploring these benefits and experimenting on how best to integrate them into a screen reader.
We feel that bringing this functionality into the core requires the technology to first tick the following boxes:
In my personal opinion, it feels like the technology is improving every day and that none of these prerequisites will prove to be intractable.
Finally, Be My AI is now available in the Windows store and it works amazingly well, including for describing the screen, not only images. It can also summarize things, etc.: https://www.bemyeyes.com/blog/be-my-eyes-desktop-app-on-windows
To be honest, if NVDA brings any AI feature into the core, it should not really focus on image description, but rather on navigation assistance on the screen, e.g. navigating and performing actions in inaccessible software. Image description with Be My AI on Windows is now a really nice feature, and it is free. NVDA developers could instead invest time in making AI really help in inaccessible places, e.g. telling NVDA to draw something on a concept board, whereby NVDA would draw it for you with the mouse. This would bring much more value than a pure image description feature.
Is your feature request related to an issue? Please define.
A smart image recognition feature was introduced in the March update of JAWS. This feature uses ChatGPT and Google Gemini to describe photos in detail and eliminates all accessibility issues related to photos. ChatGPT in particular does this job very well.
Describe your desired solution
It would be nice to bring such a feature to NVDA. I know that those who follow screen readers closely have noticed this feature. I think there will be such a demand for NVDA in the future when all users are aware of this feature.
Describe the alternatives you considered
None.
Additional context