Open jlia0 opened 1 year ago
Hi, I tinkered a bit with the accessibility API to try to extract the text directly from the apps instead of using OCR, but did not get far. If you have any good reference material, that would be great, or feel free to open a PR :)
Here's one that cyte2 references: https://github.com/tmandry/AXSwift
However, I am not sure whether there is a Python API for it. Do you mind sharing some of your tinkering code?
I believe we still need OCR to extract the text; the accessibility API is for extracting "metadata" such as the URL or window context.
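To make that split concrete, here is a rough sketch of how it might look from Python. It assumes macOS, the pyobjc bindings (`pyobjc-framework-Cocoa` and `pyobjc-framework-ApplicationServices`), and that the process has been granted Accessibility permission. `WindowContext`, `frontmost_window_context`, and `attach_context` are hypothetical names made up for illustration, not part of this project or of any library, and I have not verified this against the project's capture loop.

```python
from dataclasses import dataclass
from typing import Optional

AX_SUCCESS = 0  # numeric value of kAXErrorSuccess


@dataclass
class WindowContext:
    """Metadata read from the accessibility API (hypothetical container)."""
    app_name: str
    window_title: Optional[str]


def frontmost_window_context() -> WindowContext:
    """Read the frontmost app name and focused window title (macOS only)."""
    # Imported lazily so the module still loads where pyobjc is absent.
    from AppKit import NSWorkspace
    from ApplicationServices import (
        AXUIElementCopyAttributeValue,
        AXUIElementCreateApplication,
    )

    app = NSWorkspace.sharedWorkspace().frontmostApplication()
    if app is None:
        raise RuntimeError("no frontmost application")

    ax_app = AXUIElementCreateApplication(app.processIdentifier())
    # pyobjc returns (error_code, value) for the out-parameter.
    err, window = AXUIElementCopyAttributeValue(ax_app, "AXFocusedWindow", None)

    title = None
    if err == AX_SUCCESS and window is not None:
        err, value = AXUIElementCopyAttributeValue(window, "AXTitle", None)
        if err == AX_SUCCESS and value is not None:
            title = str(value)

    return WindowContext(app_name=str(app.localizedName()), window_title=title)


def attach_context(ocr_text: str, ctx: WindowContext) -> dict:
    """Combine OCR output with accessibility metadata into one record."""
    return {"app": ctx.app_name, "window": ctx.window_title, "text": ocr_text}
```

The idea would be to call `frontmost_window_context()` at screenshot time and pass its result, together with the OCR output, to `attach_context`. Getting a browser URL would need extra per-app handling that this sketch does not attempt.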
Unfortunately, I don't seem to have kept my tinkering code :/ I tried to use the Orca screen reader (https://github.com/GNOME/orca), but I don't think it was the right tool.
https://kevinchen.co/blog/rewind-ai-app-teardown/
^^^ I think this would probably help
Yes this blog post was very helpful :)
Is this connected to the accessibility API to retrieve context information (URL, app name, etc.) yet?