Open · hemangjoshi37a opened 6 months ago
Hi @hemangjoshi37a , thank you for your interest!
Unfortunately, OpenAdapt does not currently support Linux, for two reasons:
1. This is non-trivial additional effort for minimal gain. According to https://gs.statcounter.com/os-market-share/desktop/worldwide, Linux currently holds about 4% of global desktop OS market share.
2. The input control library we use (pynput) does not support differentiating between "injected" (synthetic) and regular (human) input on Linux (see https://github.com/moses-palmer/pynput/issues/105#issuecomment-412435532). While we do not yet make use of this functionality, we plan to in the near future.
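For context, this limitation means a listener on Linux sees injected and human events identically. One common workaround, sketched below with hypothetical names (this is not OpenAdapt's or pynput's API, just an illustration), is to record the events you inject yourself and filter matching events out of the listener stream within a short time window:

```python
import time
from collections import deque

class InjectionTracker:
    """Tracks events we injected so a listener callback can filter them out.

    Hypothetical helper for illustration -- not part of OpenAdapt or pynput.
    """

    def __init__(self, window_s=0.05):
        self.window_s = window_s
        self._pending = deque()  # (key, timestamp) pairs we injected

    def mark_injected(self, key):
        # Call this immediately before injecting a synthetic event.
        self._pending.append((key, time.monotonic()))

    def is_injected(self, key):
        # Call this from the listener callback: returns True if the event
        # matches a recently injected one, consuming that record.
        now = time.monotonic()
        # Drop stale records that fell outside the matching window.
        while self._pending and now - self._pending[0][1] > self.window_s:
            self._pending.popleft()
        for i, (k, _) in enumerate(self._pending):
            if k == key:
                del self._pending[i]
                return True
        return False

tracker = InjectionTracker()
tracker.mark_injected("a")
print(tracker.is_injected("a"))  # matches the injected event -> True
print(tracker.is_injected("a"))  # record already consumed -> False
```

This heuristic can misattribute events if a human presses the same key inside the matching window, which is exactly why first-class OS support for an "injected" flag (as on macOS and Windows) is preferable.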
That said, we would welcome a Pull Request to add installation instructions for Linux! The relevant repo is at https://github.com/OpenAdaptAI/OpenAdapt.web. Of course, this would require testing the core library on Linux as well.
But you should consider that the people who will ultimately use OpenAdapt are developers, 96% of whom use Linux, not your average everyday Karens. LOL. Also, I believe your response is AI-generated and that no actual person is responsible for it.
The strength of this project's approach seems to be that it uses SAM and multimodal models to visually parse GUI layouts, instead of relying on OS-specific features like Windows' accessibility API. Every month I check again to see whether I can use it on my OS yet. The feature I'm really excited about is simply having a way to parse a whole-screen screenshot into something that can be described in detail by an LLM. The automation/interactive parts of the project aren't necessary for me. I just want a super powerful OCR-like tool that works on whole-screen screenshots to give structured output like text, window, buttons and other input field locations.
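The structured output described above might look something like the following sketch. The `UIElement` schema and field names are hypothetical, chosen for illustration; they are not OpenAdapt's actual output format:

```python
from dataclasses import dataclass

@dataclass
class UIElement:
    """One detected element in a whole-screen screenshot.

    Hypothetical schema for illustration -- not OpenAdapt's real output.
    """
    kind: str                         # e.g. "text", "window", "button", "input"
    bbox: tuple[int, int, int, int]   # (left, top, right, bottom) in pixels
    text: str = ""                    # OCR'd label or content, if any

# What a screenshot parser as described above might emit:
elements = [
    UIElement("window", (0, 0, 1920, 1080), "Settings"),
    UIElement("button", (1700, 40, 1780, 70), "Save"),
    UIElement("input", (200, 300, 900, 330), ""),
]

# Downstream consumers (e.g. an LLM prompt builder) can then filter by kind:
buttons = [e for e in elements if e.kind == "button"]
print([b.text for b in buttons])  # -> ['Save']
```

A flat list of typed, bounded elements like this is easy to serialize to JSON and feed to an LLM as a textual description of the screen.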
@metatrot I have tried building something very similar, but it is at a very initial stage: https://github.com/microsoft/graphrag
@metatrot thank you for the information!
> I just want a super powerful OCR-like tool that works on whole-screen screenshots to give structured output like text, window, buttons and other input field locations.
Can you please clarify what this would be useful for?
@hemangjoshi37a I believe you pasted the wrong link
@abrichr I would use it for the same purposes as this: https://github.com/louis030195/screen-pipe (sadly that project is still mac-only at the moment)
Feature request

Please add install instructions for Linux on the https://openadapt.ai/#start page.

Motivation

Please add install instructions for Linux on the https://openadapt.ai/#start page.