normal-computing / fuji-web

Fuji is an AI agent that lives in your browser's sidepanel. You can now get tasks done online with a single command!
Apache License 2.0
193 stars, 13 forks

Support scrolling portion of web page #113

Open mondaychen opened 2 months ago

mondaychen commented 2 months ago

As of now, WebWand doesn't scroll well when the scrollable area is only part of the page (e.g. a dialog with long content) rather than the whole page body.

One thought is that we could list all scrollable elements on the page and ask the agent for an ID when it wants to scroll. But it might be tricky to tell the agent what those wrappers are because, unlike buttons and inputs, they are often just `<div>`s with CSS.
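A rough sketch of the "list scrollable elements and hand out IDs" idea, not the project's actual implementation. The names `ScrollMetrics`, `ScrollTarget`, `isScrollable` and `listScrollTargets` are made up for illustration. In a content script the metrics would presumably come from `getComputedStyle(el).overflowY`, `el.scrollHeight` and `el.clientHeight`; here they are passed in as plain data so the logic can be checked without a DOM.

```typescript
// Hypothetical shape: the layout facts needed to decide whether an element scrolls.
interface ScrollMetrics {
  overflowY: string;    // computed CSS overflow-y value
  scrollHeight: number; // full content height in px
  clientHeight: number; // visible height in px
}

// Hypothetical shape: what the agent would be shown for each scrollable region.
interface ScrollTarget {
  id: number;          // ID the agent references when it wants to scroll
  description: string; // short human-readable hint, e.g. a tag name or aria-label
}

// An element scrolls vertically when its CSS allows scrolling
// and its content is taller than its visible box.
function isScrollable(m: ScrollMetrics): boolean {
  const allowsScroll = m.overflowY === "auto" || m.overflowY === "scroll";
  return allowsScroll && m.scrollHeight > m.clientHeight;
}

// Keep only the scrollable candidates and assign sequential IDs starting at 1.
function listScrollTargets(
  candidates: { description: string; metrics: ScrollMetrics }[]
): ScrollTarget[] {
  return candidates
    .filter((c) => isScrollable(c.metrics))
    .map((c, i) => ({ id: i + 1, description: c.description }));
}
```

This deliberately ignores the whole-page case (the document body usually scrolls with `overflow-y: visible`), since that case is the one that already works.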

Maybe we need a subagent just for this. In that case, we could send a separate screenshot that indicates where each scrollable portion is.
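For the screenshot idea, one possible way to prepare the annotations is sketched below. It assumes each scrollable region already has an ID and a bounding box in viewport coordinates (for example from `getBoundingClientRect()`). `Rect`, `OverlayLabel`, `clipToViewport` and `buildOverlayLabels` are hypothetical names, and how the boxes are actually drawn onto the screenshot is left open.

```typescript
// Hypothetical shapes: a box in viewport pixels and a labeled box for the overlay.
interface Rect { x: number; y: number; width: number; height: number; }
interface OverlayLabel { id: number; box: Rect; }

// Clip a rectangle to the visible viewport; return null if nothing is visible.
function clipToViewport(r: Rect, viewportWidth: number, viewportHeight: number): Rect | null {
  const left = Math.max(r.x, 0);
  const top = Math.max(r.y, 0);
  const right = Math.min(r.x + r.width, viewportWidth);
  const bottom = Math.min(r.y + r.height, viewportHeight);
  if (right <= left || bottom <= top) return null;
  return { x: left, y: top, width: right - left, height: bottom - top };
}

// Build one label per region that is at least partly on screen,
// so the annotated screenshot only marks areas the subagent can see.
function buildOverlayLabels(
  regions: { id: number; rect: Rect }[],
  viewportWidth: number,
  viewportHeight: number
): OverlayLabel[] {
  const labels: OverlayLabel[] = [];
  for (const region of regions) {
    const box = clipToViewport(region.rect, viewportWidth, viewportHeight);
    if (box !== null) labels.push({ id: region.id, box });
  }
  return labels;
}
```

Clipping first means a region that is scrolled out of view simply gets no label, which keeps the IDs in the screenshot consistent with what is visible.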