X-PLUG / mPLUG-Owl

mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
https://www.modelscope.cn/studios/damo/mPLUG-Owl
MIT License

For the blind. #116

Open photogbill opened 1 year ago

photogbill commented 1 year ago

I have a friend who is blind, and I realized that if this could snap a quick photo and respond audibly, it could help him gain a better sense of awareness. Could you add an option to continuously monitor a folder and load the latest image?

I'm not a coding guy myself, but it would be really awesome if this could help those who lack sight.

Thank you regardless, Bill

photogbill commented 1 year ago

His screen reader could handle the audio part. Mainly, just being able to quickly load an image live would be awesome. Perhaps a folder that is constantly monitored for new images, and then a default text query run against any image that lands in that folder?
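
For anyone who wants to try this, here is a minimal sketch of the folder-watching idea using only the Python standard library. It is not part of mPLUG-Owl: `latest_image`, `watch_folder`, `DEFAULT_QUERY`, and the `describe` callback are all hypothetical names made up for illustration, and `describe` is where the actual model call would have to be wired in. The printed text is what a screen reader would then pick up.

```python
import time
from pathlib import Path
from typing import Callable, Optional

# Assumption: these are the image types the camera app saves.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}
# Assumption: a hypothetical default prompt, not one defined by mPLUG-Owl.
DEFAULT_QUERY = "Describe this image in detail."


def latest_image(folder: Path) -> Optional[Path]:
    """Return the most recently modified image in `folder`, or None if empty."""
    images = [
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    ]
    return max(images, key=lambda p: p.stat().st_mtime_ns, default=None)


def watch_folder(
    folder: Path,
    describe: Callable[[Path, str], str],
    query: str = DEFAULT_QUERY,
    poll_seconds: float = 1.0,
    max_polls: Optional[int] = None,
) -> None:
    """Poll `folder` and print a description whenever the newest image changes.

    `describe` is a placeholder for the model call: it takes an image path
    and a text query and returns the answer as a string.
    """
    seen = None  # (path, mtime) of the last image that was described
    polls = 0
    while max_polls is None or polls < max_polls:
        newest = latest_image(folder)
        if newest is not None:
            stamp = (newest, newest.stat().st_mtime_ns)
            if stamp != seen:
                seen = stamp
                # Plain text on stdout, so a screen reader can speak it.
                print(describe(newest, query))
        polls += 1
        time.sleep(poll_seconds)
```

Usage would look something like `watch_folder(Path("photos"), describe=my_model_call)`, where `my_model_call` is whatever function runs the image and query through the model. Polling once a second is a simple choice that avoids extra dependencies. One caveat: a photo that is still being written when the poll happens may be picked up half-finished, so a real version might wait briefly or retry before describing it.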