ishan0102 / vimGPT

Browse the web with GPT-4V and Vimium
MIT License
2.63k stars, 197 forks

Pass in accessibility tree with the prompt #4

Status: Open (opened by ishan0102 11 months ago)

ishan0102 commented 11 months ago

Chrome creates accessibility trees for webpages.

An accessibility tree is a tree of accessibility objects that assistive technology can query for attributes and properties and perform actions on.

This might defeat the purpose of using visual-only information, but it may be beneficial to include the accessibility tree in the prompt and find a way to map its elements to their corresponding Vimium keybindings. Essentially, we want a great interface for the model to select objects (Vimium) while pruning the page down to relevant context (accessibility tree).
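A rough sketch of what the pruning and mapping step could look like, not a description of how vimGPT does it. It assumes the accessibility tree is already available as nested dicts with `role`, `name`, and `children` keys (similar in shape to a Playwright accessibility snapshot), and that the Vimium hint labels can be supplied in the same document order as the interactive elements. Both are assumptions; in particular, reliably pairing hints with nodes is the open problem here. The names `INTERACTIVE_ROLES`, `iter_interactive`, and `build_prompt_context` are hypothetical.

```python
from typing import Dict, Iterator, List

# Roles kept after pruning. This set is an illustrative assumption,
# not an exhaustive list of interactive ARIA roles.
INTERACTIVE_ROLES = {"link", "button", "textbox", "checkbox", "combobox"}


def iter_interactive(node: Dict) -> Iterator[Dict]:
    """Yield named interactive nodes in depth-first (document) order."""
    if node.get("role") in INTERACTIVE_ROLES and node.get("name"):
        yield {"role": node["role"], "name": node["name"]}
    for child in node.get("children", []):
        yield from iter_interactive(child)


def build_prompt_context(tree: Dict, hints: List[str]) -> str:
    """Pair each pruned node with a hint label, one element per line.

    Assumes `hints` is ordered the same way as the interactive nodes;
    zip() silently drops whichever side has extra entries.
    """
    lines = [
        f"[{hint}] {node['role']}: {node['name']}"
        for hint, node in zip(hints, iter_interactive(tree))
    ]
    return "\n".join(lines)
```

The resulting string could be appended to the prompt next to the screenshot, so the model sees each hint label alongside the role and accessible name of the element it selects.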

benfield97 commented 10 months ago

working on this