-
http://twitter.com/kevingo/status/940942203550969856
-
-
Asynchronous parallel training like A3C is supported by ChainerRL, but synchronous parallel training, where multiple actors interact with their own environments in a synchronous manner, is not support…
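The distinction above can be illustrated with a minimal sketch of synchronous stepping (hypothetical `ToyEnv` and `step_all` names for illustration only, not ChainerRL's actual API): every actor's environment advances exactly once per call, in lockstep, so actions can be chosen for the whole batch at once.

```python
import random

class ToyEnv:
    """Hypothetical 1-D random-walk environment, used only for illustration."""
    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # action is +1 or -1; the episode ends when |pos| reaches 3
        self.pos += action + self.rng.choice([0, 0, 1, -1])
        done = abs(self.pos) >= 3
        reward = 1.0 if self.pos >= 3 else 0.0
        return self.pos, reward, done

def step_all(envs, actions):
    """Synchronous stepping: every env advances exactly once per call."""
    results = [env.step(a) for env, a in zip(envs, actions)]
    obs, rewards, dones = zip(*results)
    return list(obs), list(rewards), list(dones)

envs = [ToyEnv(seed=i) for i in range(4)]
obs = [env.reset() for env in envs]
for _ in range(10):
    actions = [1] * len(envs)  # a fixed policy keeps the sketch simple
    obs, rewards, dones = step_all(envs, actions)
    # reset finished environments so the batch stays full
    obs = [env.reset() if d else o for env, o, d in zip(envs, obs, dones)]
```

In asynchronous training (A3C proper), each actor would instead run its own loop on its own thread or process, stepping at its own pace.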
-
Are there any particular requirements on how the reward function is set up? A3C uses state values rather than action values, so should the reward be directly related to the state variables? Another puzzling question: why do two runs with the same weights produce very different results? My cumulative reward shows a trend toward convergence, but it still fluctuates a lot!
![total_reward](https://user-images.githubusercontent.com/68805707/884750…
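On the run-to-run variance question: identical initial weights are not enough for identical results unless every randomness source (environment, exploration noise, framework RNGs) is seeded, and in asynchronous A3C even seeding cannot remove the nondeterminism introduced by thread scheduling. A toy sketch (hypothetical `rollout` function, not any library's API) showing that seeding the noise is what makes runs repeatable:

```python
import random

def rollout(weights, seed=None):
    """Toy rollout: cumulative reward of a fixed linear 'policy'.

    Exploration noise comes from `rng`; with no seed, two runs that
    share the same weights still diverge.
    """
    rng = random.Random(seed)
    total, state = 0.0, 1.0
    for _ in range(100):
        action = weights * state + rng.gauss(0.0, 0.1)  # noisy action
        total += -abs(action - 0.5)                      # reward peaks at 0.5
        state = 0.9 * state
    return total

w = 0.7
# Unseeded: identical weights, different results.
a, b = rollout(w), rollout(w)
# Seeded: identical weights AND identical noise -> identical results.
c, d = rollout(w, seed=0), rollout(w, seed=0)
```

Large fluctuation around a converging trend is also normal for A3C; smoothing the curve over many episodes, or averaging several seeded runs, gives a more readable picture.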
-
Hello!
I wrote my own A3C with LSTM, but it was not perfect. When I trained the model with minibatches it didn't train, but when I used whole-episode experiences it worked well (only fed the LSTM state o…
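A common cause of exactly this symptom is that, when an episode is split into minibatches, the recurrent state gets re-zeroed at every batch boundary instead of being carried forward (with gradients truncated at the boundary). A minimal sketch with a toy recurrent cell (hypothetical, not the poster's code) shows that carrying the state reproduces the whole-episode result while resetting it does not:

```python
import math

def rnn_step(h, x, w=0.5, u=0.8):
    # one step of a toy recurrent cell: h' = tanh(w*h + u*x)
    return math.tanh(w * h + u * x)

def run(xs, h0=0.0):
    """Unroll the cell over a sequence, starting from hidden state h0."""
    h = h0
    for x in xs:
        h = rnn_step(h, x)
    return h

episode = [0.1, 0.4, -0.2, 0.3, 0.5, -0.1, 0.2, 0.0]

# Whole-episode pass: state flows through every step.
h_full = run(episode)

# Minibatches with the state re-zeroed each batch (the buggy pattern):
h_reset = 0.0
for chunk in (episode[:4], episode[4:]):
    h_reset = run(chunk, h0=0.0)   # state lost at the boundary

# Minibatches that carry the state forward (truncated BPTT pattern):
h_carried = 0.0
for chunk in (episode[:4], episode[4:]):
    h_carried = run(chunk, h0=h_carried)
```

The same idea applies to a real LSTM: detach the hidden/cell state between batches so gradients are truncated, but pass its value into the next batch; zero it only at true episode boundaries.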
-
I ran Cart_Pole.py with A3C & A2C on Windows and got this error:
Traceback (most recent call last):
File "D:/学习/Deep-Reinforcement-Learning-Algorithms-with-PyTorch-master/results/Cart_Pole.py",…
-
1. Resolve installation and dependency issues explicitly
1-1. Explain which libraries and frameworks we used and how we proceeded
1-2. We must demonstrate the originality of the design: rather than importing a complete implementation, we should only be borrowing functions from an already well-written library.
2. Pin down the theory and equations we applied.
2-1. Currently referring to the following book:…
-
Thank you for the easy-to-use and fast A3C implementation. I created a simple problem for rapid testing that rewards 0 on all steps except the terminal step, where the reward is either -1 or 1. GA3C cann…
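A test environment like the one described can be sketched as follows (hypothetical names, not the GA3C test harness): every intermediate step pays 0, and only the terminal step pays -1 or +1. Such sparse terminal rewards are a good stress test because the value function must propagate a single signal backward through the whole episode.

```python
class SparseRewardEnv:
    """Reward is 0 on every step; only the terminal step pays -1 or +1.

    The agent walks on positions 0..n: reaching n pays +1, reaching 0
    pays -1. Illustrative only.
    """
    def __init__(self, n=5):
        self.n = n
        self.pos = n // 2

    def reset(self):
        self.pos = self.n // 2
        return self.pos

    def step(self, action):        # action: 0 = left, 1 = right
        self.pos += 1 if action == 1 else -1
        if self.pos >= self.n:
            return self.pos, 1.0, True
        if self.pos <= 0:
            return self.pos, -1.0, True
        return self.pos, 0.0, False

env = SparseRewardEnv(n=5)
obs, done, rewards = env.reset(), False, []
while not done:
    obs, r, done = env.step(1)     # always go right -> terminal reward +1
    rewards.append(r)
```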
-
I ran Cart_Pole.py and got an error.
-
Hi Kosuke,
I've tried your model on the Breakout game. The performance was amazing: the average score went up to 520 after 80M steps. It's far better than any other model I've tried.
But the …