Closed Itsukara closed 7 years ago
Thanks! I'm gonna review to merge.
Python3.5対応
I don't have 3.5 environment now, but your code worked fine with 2.7 env. Thanks.
不要なコード削除
I used to leave these dummy A3CTraningThread instantiating code intentionally in display tool and visualize tool, because checkpoints file contains local network parameters in addition to global network. However now I realized that checkpoints loading works fine even if we only load global network parameter, as you commented out these part. Thanks.
ALEの初期設定、Breakoutのアクション表示を追加
I used to set "frame_skip" and "repeat_action_probability" parameters in "ale.cfg" file, but I'll set these parameters in game_state class by code as you suggest. It will be easier to understand.
pi=0.0になるケースへの対応
I've already applied your fix to master yesterday. I optimized it a bit with unifying two log pi calculating nodes. Thanks.
https://github.com/miyosuda/async_deep_reinforce/commit/41f2d75499accb1b2073daae295603bb6d89f88e
a3c_display.pyでのプレイ画面出力と動画作成スクリプト新規追加
This function seems useful, so I'll add switching option for this function later.
I'll test these commits in development branch and merge them to master laster. Thanks!
Thank you very much for take a time to review and test my change. I'm looking forward your merge.
BTW, I'm trying other changes to speed up training process. I might propose another change if the try is successful. But it will take a time because the training and verification take so much time. (^_^;)
@Itsukara Thanks, I'm looking forward to your proposal.
Hi @miyosuda,
I noticed today that you already merged my pull request. Thank you very much for your kindness to take a time to check and merge my code.
As for "another change", I noticed that you already added in you new code!! That is: game_state.py: self.ale.setInt(b'frame_skip', 4)
So, I have no more proposal now.
As for "another change", I noticed that you already added in you new code!! That is: game_state.py: self.ale.setInt(b'frame_skip', 4)
Ah, I see! Thanks.
Hello @miyosuda,
Your program is great!! I'm playing with deep learning using your code.
In my try of breakout game, I changed some parts of your program. I believe this change is useful for other people. So, I'd like to share it. Could you merge my change to your code?
As for what I changed , could you see commitment message? I extracted and added some comment in following (sorry for intermix of Japanese):
・Python3.5対応 (Python2.7でも問題ないはずです) ・不要なコード削除 (a3c_display.pyやa3c_visualize.pyで不要なthreadを作成していたのでコメンと化) ・対象ゲームをBreakoutに変更(constan.pyのACTION_SIZEも変更。ちなみに定数化は不要では?) ・a3c_display.pyでのプレイ画面出力機能と追加と、動画作成スクリプト新規追加 ・ALEの初期設定、Breakoutのアクション表示を追加(ALEのドキュメント記載のデフォルト値は好ましくないため) ・各種情報の表示機能追加(ゲーム状態を各thread毎に縦の列で追えるようになってます。また性能を表示) ・a3c_visualize.pyのpython3.5対応+不要コード削除 ・pi=0.0になるケースへの対応(別途Issueでお知らせした件です) ・ログ情報からのデータ抽出、表示ツールを追加