An accessibility tree is a tree of accessibility objects that assistive technology can query for attributes and properties and perform actions on.
Chrome creates accessibility trees for webpages.
Using the accessibility tree might defeat the purpose of relying on visual-only information, but it may be beneficial to include it with the prompt and find a way to map its elements to their corresponding Vimium keybindings. Essentially, we want to give the model a great interface for selecting objects (Vimium) while pruning for relevant context (accessibility tree).
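As a rough illustration of the pruning-and-mapping idea, here is a minimal sketch rather than a working integration. It assumes the accessibility tree is already available as nested dictionaries with `role`, `name`, and `children` keys (the shape browser automation tools typically expose), and that a hypothetical `hints` dictionary pairing each Vimium hint string with the label of the element it marks has been obtained some other way. The `INTERACTIVE_ROLES` set and the function names are illustrative choices, not part of Vimium or Chrome.

```python
from typing import Iterator

# Roles treated as "interactive"; this set is an assumption, not a standard list.
INTERACTIVE_ROLES = {"button", "link", "textbox", "checkbox", "combobox"}


def iter_nodes(node: dict) -> Iterator[dict]:
    """Yield every node of a nested accessibility snapshot, depth first."""
    yield node
    for child in node.get("children", []):
        yield from iter_nodes(child)


def prune(snapshot: dict) -> list[dict]:
    """Keep only named, interactive nodes as flat {role, name} records."""
    return [
        {"role": n["role"], "name": n["name"]}
        for n in iter_nodes(snapshot)
        if n.get("role") in INTERACTIVE_ROLES and n.get("name")
    ]


def attach_hints(nodes: list[dict], hints: dict[str, str]) -> list[str]:
    """Render one prompt line per node whose name has a matching hint.

    `hints` maps a Vimium hint string to the label of the element it marks.
    How to extract that mapping from the page is left open here.
    """
    by_name = {label: hint for hint, label in hints.items()}
    return [
        f"[{by_name[n['name']]}] {n['role']}: {n['name']}"
        for n in nodes
        if n["name"] in by_name
    ]


# Hand-written sample data standing in for a real page.
snapshot = {
    "role": "WebArea",
    "name": "Example",
    "children": [
        {"role": "heading", "name": "Welcome"},
        {"role": "link", "name": "Sign in"},
        {"role": "generic", "name": "", "children": [
            {"role": "textbox", "name": "Search"},
            {"role": "button", "name": "Go"},
        ]},
    ],
}
hints = {"A": "Sign in", "S": "Search", "D": "Go"}
prompt_lines = attach_hints(prune(snapshot), hints)
```

Matching on the accessible name is the weakest part of this sketch: two elements with the same label would collide, so a real mapping would likely need positions or DOM identifiers as well.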