dialect-app / dialect

A translation app for GNOME.
https://dialectapp.org/
GNU General Public License v3.0

Feature Request: translate from image/screen capture #397

Open 0chroma opened 1 month ago

0chroma commented 1 month ago

Hello! Oftentimes I'll be using an app or watching a video and want to translate something, but I can't because the text isn't selectable. Or I might want to translate something physical that I could take a picture of.

I think it'd be great to have a feature like the one in mobile translation apps, where you can take a picture and translate any text in the image.

I would be happy to implement this feature, since it's something I'd use very often. Thank you!

rafaelmardojai commented 1 month ago

Usually we recommend Frog or any other OCR software to get the text and then use Dialect. But I agree that it would be nice to have such a feature, and we have it on the roadmap, but it's not high priority.

If you would like to work on this feature, you can find some pointers on how we would like it to be implemented here: https://github.com/dialect-app/dialect/pull/331#issuecomment-1607804356. Currently I don't know if we would want a "providers" approach for this as we do for translations and text-to-speech, but that's up for discussion.

IMO the trickiest part of this feature, if we work with an offline FOSS library like Tesseract, is getting the heuristics right so the detected text keeps its structure, but I guess that's already figured out in other projects.
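
To give a rough idea of the kind of structure heuristic meant here, below is a minimal Python sketch, not Dialect code. It assumes word-level OCR output shaped like the dict that `pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)` returns (parallel lists keyed by `block_num`, `par_num`, `line_num`, `conf` and `text`). The function name `rebuild_text` and the `min_conf` parameter are hypothetical. Words are grouped back into lines and paragraphs, and the lines of a paragraph are joined so that a sentence wrapped across lines reaches the translator in one piece.

```python
from itertools import groupby


def rebuild_text(data, min_conf=0.0):
    """Rebuild paragraphs from word-level OCR output.

    `data` is assumed to be a dict of parallel lists with the keys
    block_num, par_num, line_num, conf and text (the layout of
    pytesseract's image_to_data DICT output). Hypothetical helper.
    """
    words = []
    for i, text in enumerate(data["text"]):
        text = text.strip()
        # Skip structural rows (empty text) and low-confidence words.
        if not text or float(data["conf"][i]) < min_conf:
            continue
        key = (data["block_num"][i], data["par_num"][i], data["line_num"][i])
        words.append((key, text))

    paragraphs = []
    # Rows are assumed to arrive in reading order, so grouping
    # consecutive items by (block, paragraph) is enough.
    for _, par_words in groupby(words, key=lambda w: w[0][:2]):
        lines = [
            " ".join(text for _, text in line_words)
            for _, line_words in groupby(par_words, key=lambda w: w[0])
        ]
        # Unwrap the paragraph: join its lines with spaces.
        paragraphs.append(" ".join(lines))

    # Separate paragraphs with a blank line.
    return "\n\n".join(paragraphs)
```

This only covers the easy case. Hyphenated line breaks, multi-column layouts, and text overlaid on video frames would need extra handling, which is probably where most of the real heuristic work is.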