World of Bits: An Open-Domain Platform for Web-Based Agents

bishopfunc commented 1 year ago

Summary

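What is it? (Abstract, Conclusion)
The authors developed World of Bits (WoB), a reinforcement learning environment in which agents carry out tasks on the internet using keyboard and mouse actions. WoB faces two main challenges: (i) curating web-based tasks, and (ii) giving those tasks a reward structure and keeping them reproducible even though the web keeps changing. Caching HTTP traffic makes it possible to approximate the tasks offline. The paper describes three methods for creating web-based tasks: MiniWoB, FormWoB, and QAWoB. Standard supervised learning and reinforcement learning methods achieved reasonable results, but the gap to human performance is still large.
How does it improve on prior work? (Prior work, problem solved)
Simulation environments matter, not just algorithms. Simulation has its limits: in robotics, physical hardware restricts data collection and iteration. WoB is the first approach to combine pixels with keyboard/mouse input. It is related to web-based and language-driven interaction research.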
- first to tackle the problem of interacting with websites using both vision and raw mouse and keyboard actions on open-domain tasks at scale
- a bridge between this semantic-oriented work and the more control-oriented tasks found in most reinforcement learning environments.
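What is the core of the technique? (Novelty, how it was solved)
- Open-domain: it can use the existing internet as its environment
- Open-source: easy to adapt as things change
- Easy to collect data: it uses the same interface as humans, so large amounts of data can be collected through crowdsourcing
How was its effectiveness validated? (Experimental method)
Random, SL, and SL+RL agents were compared. The results were reasonable, but the gap to human performance is still large.
Any discussion? (Open problems, applications)
Which paper should be read next?
(Optional) A more detailed understanding of the method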

Paper information and links

Shi, T., Karpathy, A., Fan, L., Hernandez, J. & Liang, P. (2017). World of Bits: An Open-Domain Platform for Web-Based Agents. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:3135-3144. Available from https://proceedings.mlr.press/v70/shi17a.html.
Google Scholar citation count:
International conference:

bishopfunc commented 1 year ago

The browser runs inside Docker and is operated through Browser Gym. At each step t:

State: pixel, DOM, r

Raw pixels: W, H, C
DOM: each element has a bounding box given as an (x, y, w, h) 4-tuple
Reward: r
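The state described above can be written down as a small data structure. This is a minimal sketch only: the class names `DomElement` and `Observation`, the `contains` helper, and the 160 × 210 blank screen are illustrative assumptions, not the platform's actual API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DomElement:
    """One DOM element with its bounding box (x, y, w, h) in screen pixels."""
    text: str
    x: int
    y: int
    w: int
    h: int

    def contains(self, px: int, py: int) -> bool:
        """True if the pixel (px, py) falls inside this element's box."""
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h


@dataclass
class Observation:
    """State at step t: raw pixels, DOM elements, and the scalar reward r."""
    pixels: List[List[List[int]]]  # indexed [row][col][channel], i.e. H x W x C
    dom: List[DomElement]
    reward: float


def blank_screen(width: int, height: int, channels: int = 3) -> List[List[List[int]]]:
    """Build an all-zero screen buffer (a stand-in for a real screenshot)."""
    return [[[0] * channels for _ in range(width)] for _ in range(height)]


# Hypothetical example: one button on an otherwise empty 160 x 210 page.
obs = Observation(
    pixels=blank_screen(160, 210),
    dom=[DomElement("Submit", x=40, y=120, w=60, h=20)],
    reward=0.0,
)
```

Keeping the bounding box next to the element text makes it easy to check which element a given cursor position lands on, for example `obs.dom[0].contains(50, 125)`.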

Action

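Cursor position: (mx, my)
Mouse and keyboard actions are modeled as multinomial distributions
Mouse action types: no-op, click, drag, scroll-up, scroll-down (five options, counting no-op)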
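Sampling an action from these multinomial distributions might look like the sketch below. The notes do not say how the cursor position is parameterized, so the regular grid of cells, the cell size, and every function name here are assumptions made for illustration.

```python
import math
import random

MOUSE_ACTIONS = ["no-op", "click", "drag", "scroll-up", "scroll-down"]


def softmax(logits):
    """Numerically stable softmax over a flat list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(probs, rng):
    """Draw one index from a categorical (single-trial multinomial) distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def sample_action(pos_logits, type_logits, grid_w, cell, rng):
    """Sample a cursor cell and a mouse action type independently.

    Assumption: the screen is split into square cells of `cell` pixels,
    `grid_w` cells per row, and pos_logits is laid out row by row.
    """
    idx = sample_index(softmax(pos_logits), rng)
    row, col = divmod(idx, grid_w)
    # Report the centre of the sampled cell as (mx, my) in pixels.
    mx = col * cell + cell // 2
    my = row * cell + cell // 2
    action_type = MOUSE_ACTIONS[sample_index(softmax(type_logits), rng)]
    return (mx, my), action_type


# Hypothetical usage: a 20 x 20 grid of 8-pixel cells with uniform logits.
rng = random.Random(0)
position, action_type = sample_action([0.0] * 400, [0.0] * 5, grid_w=20, cell=8, rng=rng)
```

Discretizing the cursor keeps the position head a plain classification problem, at the cost of cell-sized precision.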

Architecture

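Process the image with a CNN
Build feature maps from the query and the DOM
Concatenate the two feature maps
LocalCNN
- Can learn from local features
- Soft attention indicates where the cursor should attend
GlobalCNN
- Flattens the feature map and feeds it into fully connected layers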
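The difference between the two heads can be shown with a toy example in plain Python. This is a heavily simplified sketch, not the paper's network: the feature maps are tiny nested lists, the local head is a single per-cell linear score followed by a softmax over positions, and the global head is one fully connected layer. All names and weights are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def concat_channels(a, b):
    """Concatenate two H x W x C feature maps along the channel axis."""
    return [[pa + pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


def local_head(fmap, w):
    """Score each cell from its own features only, then softmax over cells.

    The resulting distribution over positions acts as soft attention
    for the cursor.
    """
    scores = [sum(f * wi for f, wi in zip(cell, w)) for row in fmap for cell in row]
    return softmax(scores)


def global_head(fmap, weights, bias):
    """Flatten the whole map and apply one fully connected layer."""
    flat = [f for row in fmap for cell in row for f in cell]
    logits = [sum(f * wi for f, wi in zip(flat, w_row)) + b
              for w_row, b in zip(weights, bias)]
    return softmax(logits)


# Toy 2 x 2 maps with one channel each (image features and query/DOM features).
image_feats = [[[1.0], [0.0]], [[0.0], [0.0]]]
text_feats = [[[0.0], [0.0]], [[0.0], [2.0]]]
fused = concat_channels(image_feats, text_feats)

cursor_probs = local_head(fused, w=[1.0, 1.0])
output_probs = global_head(fused, weights=[[1.0] * 8, [0.0] * 8], bias=[0.0, 0.0])
```

In this toy setup the local head puts most of its mass on the bottom-right cell, where the query/DOM feature is strongest, while the global head sees every cell at once before producing its output.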

Behavior cloning

Supervised learning
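Behavior cloning treats each demonstrated action as a classification label. A minimal sketch of the loss, assuming the policy outputs one vector of logits per step and the demonstration provides the index of the chosen action (the function names are illustrative):

```python
import math


def log_softmax(logits):
    """Log-probabilities from logits, computed with the log-sum-exp trick."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - lse for v in logits]


def behavior_cloning_loss(batch):
    """Mean negative log-likelihood of the demonstrated actions.

    `batch` is a list of (logits, demonstrated_action_index) pairs.
    """
    return -sum(log_softmax(logits)[action] for logits, action in batch) / len(batch)


# A policy that already prefers the demonstrated action has a lower loss
# than one that is uniform over the four actions.
uniform_loss = behavior_cloning_loss([([0.0, 0.0, 0.0, 0.0], 2)])
confident_loss = behavior_cloning_loss([([0.0, 0.0, 5.0, 0.0], 2)])
```

Minimizing this loss pushes probability toward the actions the human demonstrators took.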

Reinforcement learning.

A3C, Generalized Advantage Estimation
Separate fine-tuning is required
The hyperparameters are given in the paper, but the actual implementation may differ quite a bit from page to page
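For reference, Generalized Advantage Estimation can be computed with a single backward pass over an episode. This is a generic sketch of the estimator, not the paper's code; the default `gamma` and `lam` values are placeholders rather than the paper's settings.

```python
def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation for one episode.

    `values` must have len(rewards) + 1 entries: the value estimate for each
    visited state plus a bootstrap value for the state after the last step
    (0.0 if the episode terminated).
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error at time t.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


# With zero value estimates and lam = 1, the advantage is the discounted return.
example = gae([0.0, 0.0, 1.0], [0.0, 0.0, 0.0, 0.0], gamma=0.5, lam=1.0)
```

Lower `lam` values reduce variance by leaning more on the value estimates; `lam = 1` recovers the Monte Carlo return minus the baseline.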

bishopfunc commented 1 year ago

Minimalistic Web Tasks: MiniWoB

Atari-like; 100 tasks.

Each task is an HTML page 210 pixels high and 160 pixels wide. The top 50 pixels (yellow background) contain the natural language task description (randomly generated), and the 160 × 160 area below is for interactions.

Rewards range from −1.0 (failure) to 1.0 (success).
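The page layout and reward range above translate directly into a few constants. The helper names `region` and `clip_reward` are made up for this sketch.

```python
PAGE_W, PAGE_H = 160, 210  # page size in pixels
QUERY_H = 50               # height of the task-description strip at the top


def region(x: int, y: int) -> str:
    """Classify a pixel as part of the query strip or the interaction area."""
    if not (0 <= x < PAGE_W and 0 <= y < PAGE_H):
        return "outside"
    return "query" if y < QUERY_H else "interaction"


def clip_reward(r: float) -> float:
    """Keep a reward inside the range from -1.0 (failure) to 1.0 (success)."""
    return max(-1.0, min(1.0, r))
```

The interaction area is `PAGE_H - QUERY_H` = 160 pixels tall, which matches the 160 × 160 square described above.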

bishopfunc commented 1 year ago

Live Web Tasks: FormWoB

Uses real websites.

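For the offline approximation, a proxy records every HTTP request and response made during human demonstrations. The reward function is defined over key-value pairs. When the agent makes a request that does not appear in the demonstrations (a cache miss), the episode ends. In the live environment, a small reward is returned.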
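The record-and-replay idea can be sketched with a dictionary keyed by request. Everything here is illustrative: the `(method, url, body)` cache key, the class and function names, the example URL, and in particular the scoring rule in `form_reward` are assumptions, since the notes only say that the reward is defined over key-value pairs.

```python
from typing import Dict, Optional, Tuple

Request = Tuple[str, str, str]  # (method, url, body); a hypothetical cache key


class ReplayCache:
    """Stores request/response pairs recorded during human demonstrations."""

    def __init__(self) -> None:
        self._store: Dict[Request, str] = {}

    def record(self, request: Request, response: str) -> None:
        """Save the response observed for a request during a demonstration."""
        self._store[request] = response

    def replay(self, request: Request) -> Tuple[Optional[str], bool]:
        """Return (response, done). A cache miss ends the episode."""
        if request in self._store:
            return self._store[request], False
        return None, True


def form_reward(submitted: Dict[str, str], expected: Dict[str, str]) -> float:
    """Fraction of expected key-value pairs present in the submitted form.

    The scoring rule is an assumption made for this sketch.
    """
    if not expected:
        return 0.0
    hits = sum(1 for key, value in expected.items() if submitted.get(key) == value)
    return hits / len(expected)


cache = ReplayCache()
recorded = ("GET", "https://example.com/search?from=SFO&to=JFK", "")
cache.record(recorded, "<html>results</html>")
```

Calling `cache.replay(recorded)` returns the stored page with `done=False`, while any request that was never demonstrated returns `(None, True)` and ends the episode.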

FormWoB benchmark: form filling and submit-button tasks, applied to four flight-booking sites: United, Alaska, AA, and JetBlue.

bishopfunc commented 1 year ago

Crowdsourcing Web Tasks at Scale: QAWoB

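Tasks are generated in natural language, without hand-designing a reward function.

Stage 1 (query): write question sentences, with multiple patterns for the keywords.

Stage 2: demonstrate the answer for each query; a reward is returned once the DOM element is clicked.

Constraints

Queries must be easy to understand
Mobile-friendly: demonstrations are performed on mobile, so sites must be responsive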

QAWoB benchmark: 521 query templates, 2 slots, 10-100 slot values collected, 13,550 total queries.
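Expanding a template with slot values into concrete queries is a Cartesian product over the slots. The template text and slot values below are invented for illustration and do not come from the benchmark.

```python
from itertools import product
from typing import Dict, List


def expand_template(template: str, slot_values: Dict[str, List[str]]) -> List[str]:
    """Fill every combination of slot values into the template."""
    names = list(slot_values)
    combos = product(*(slot_values[name] for name in names))
    return [template.format(**dict(zip(names, combo))) for combo in combos]


# Hypothetical two-slot template with two values per slot.
queries = expand_template(
    "What is the population of {place} in {year}?",
    {"place": ["Tokyo", "Osaka"], "year": ["2010", "2020"]},
)
```

A template with two slots and two values per slot yields four queries, so a few hundred templates with tens of values each quickly reach thousands of queries.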

GUI operations: 100 templates, 7 types of GUI operations.
