Flood-Fill Q-Learning Updates for Learning Redundant Policies in Order to Interact with a Computer Screen by Clicking

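https://www.semanticscholar.org/paper/Flood-Fill-Q-Learning-Updates-for-Learning-Policies-Preez-Wilkinson-Gallagher/0d96a54f133e7624d1ea35ec1623275386fbb6f1

Abstract: We propose a specialization of Q-learning for the problem of training an agent to click on a computer screen. In this problem, the agent takes the pixels of the screen as input and selects a pixel as output. Choosing which pixel to click means choosing an action from a large discrete action space in which many actions are completely equivalent in terms of reinforcement-learning state transitions. We propose to exploit this by performing Q-learning updates for all equivalent actions simultaneously. To determine which actions (pixels) are equivalent, we apply the flood-fill algorithm to the input image.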
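The idea in the abstract can be illustrated with a minimal tabular sketch. It is not the paper's implementation: the function names `flood_fill_mask` and `flood_fill_q_update`, the 4-connectivity, the exact-color match, the single-state Q-table with one entry per pixel, and the hyperparameters `alpha` and `gamma` are all assumptions made for illustration.

```python
from collections import deque

import numpy as np


def flood_fill_mask(image, start):
    """Boolean mask of pixels 4-connected to `start` that share its value.

    Assumption: two pixels count as equivalent actions when they lie in the
    same connected region of identical color.
    """
    height, width = image.shape[:2]
    target = image[start]
    mask = np.zeros((height, width), dtype=bool)
    mask[start] = True
    queue = deque([start])
    while queue:
        row, col = queue.popleft()
        for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            n_row, n_col = row + d_row, col + d_col
            inside = 0 <= n_row < height and 0 <= n_col < width
            if (inside and not mask[n_row, n_col]
                    and np.array_equal(image[n_row, n_col], target)):
                mask[n_row, n_col] = True
                queue.append((n_row, n_col))
    return mask


def flood_fill_q_update(q_values, image, click, reward, next_max_q,
                        alpha=0.1, gamma=0.9):
    """Apply one Q-learning update to every pixel equivalent to `click`.

    `q_values` is a float array of shape (height, width) holding the Q-value
    of clicking each pixel in the current state (hypothetical layout).
    """
    mask = flood_fill_mask(image, click)
    td_target = reward + gamma * next_max_q
    # Every pixel in the filled region moves toward the same TD target.
    q_values[mask] += alpha * (td_target - q_values[mask])
    return mask


# Usage: a 4x4 screen with a 2x2 "button" of value 1 in the middle.
screen = np.zeros((4, 4), dtype=int)
screen[1:3, 1:3] = 1
q_values = np.zeros((4, 4))
updated = flood_fill_q_update(q_values, screen, click=(1, 1),
                              reward=1.0, next_max_q=0.0, alpha=0.5)
print(updated.sum(), "pixels updated from a single click")
```

A single click on the button updates all four of its pixels instead of one, which is how the approach reduces the effective size of the action space. With a neural Q-function the same mask could be used to apply the loss to all masked outputs, though that variant is not shown here.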

hajisho / world_model2022_group22

Flood-Fill Q-Learning Updates for Learning Redundant Policies in Order to Interact with a Computer Screen by Clicking #28