-
Hi, I am really stuck on the discount_rewards function. Can you explain the logic behind it? It seems it's updating the rewards in the forward direction.
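For reference, here is a minimal sketch of how a function with this name is commonly written in policy-gradient examples. It is an assumption that the function being asked about follows this pattern; the signature and the `gamma` default are illustrative. In this form the loop walks backward through time, so each step's return is its own reward plus the discounted return of the step after it, even though the output array is indexed forward.

```python
import numpy as np

def discount_rewards(rewards, gamma=0.99):
    """Return the discounted return G_t = r_t + gamma * G_{t+1} for every step.

    Hypothetical signature; the original function may differ.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    discounted = np.zeros_like(rewards)
    running = 0.0
    # Iterate from the last step to the first so that `running`
    # always holds the return of the following step.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        discounted[t] = running
    return discounted

example = discount_rewards([0.0, 0.0, 1.0], gamma=0.5)
```

With `gamma=0.5` and a single reward of 1 at the final step, the earlier steps receive 0.5 and 0.25: credit flows from the end of the episode back toward the start.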
-
Google Next Extended Seoul 2018 | Festa!
https://festa.io/events/104
[The Google Next Extended Seoul event is a place to share and learn about a variety of Google Cloud Platform use cases, including big data, machine learning, databases, and serverless.]
...or so they say. This weekend…
-
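## In a nutshell
Research on large-scale distributed training for reinforcement learning. In A3C, each agent sends gradients to a central server, whereas in the proposed method (IMPALA) the agents send their raw experience (state/action/reward) to the center (the Learner), which does the training. The edge agents therefore learn off-policy, and the paper proposes a method called V-trace to assign an importance weight to each piece of experience.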
![image](https://user-i…
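
As a rough illustration of the importance weighting described above, the following is a minimal NumPy sketch of V-trace value targets for a single trajectory. The function name, argument names, and defaults are assumptions made for illustration, and episode-termination handling is omitted; it is not the paper's reference implementation. `ratios` holds the per-step ratio of the learner policy's probability to the actor policy's probability for the action taken.

```python
import numpy as np

def vtrace_targets(rewards, values, bootstrap_value, ratios,
                   gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """Compute V-trace targets v_s for one trajectory (illustrative sketch)."""
    rewards = np.asarray(rewards, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)
    ratios = np.asarray(ratios, dtype=np.float64)

    # Truncated importance weights.
    rhos = np.minimum(rho_bar, ratios)
    cs = np.minimum(c_bar, ratios)

    # Importance-weighted temporal-difference errors.
    next_values = np.append(values[1:], bootstrap_value)
    deltas = rhos * (rewards + gamma * next_values - values)

    # Backward recursion:
    # v_s - V(x_s) = delta_s + gamma * c_s * (v_{s+1} - V(x_{s+1}))
    corrections = np.zeros_like(values)
    acc = 0.0
    for t in reversed(range(len(rewards))):
        acc = deltas[t] + gamma * cs[t] * acc
        corrections[t] = acc
    return values + corrections

on_policy = vtrace_targets(
    rewards=[1.0, 2.0, 3.0],
    values=[0.5, -1.0, 2.0],
    bootstrap_value=4.0,
    ratios=[1.0, 1.0, 1.0],
    gamma=0.9,
)
```

When every ratio is 1 (the on-policy case), the targets reduce to ordinary n-step bootstrapped returns, which is a convenient sanity check.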
-
Hi,
Will the benchmark code for the four navigation algorithms in the paper be released?
Also, how long does it take to train the agents? My English is poor, so I have some confusion about the followin…
-
I'm playing around with reinforcement learning, and in this case, loss value should only be calculated for one output, and be zero for the rest. It's not much of a problem in most cases, as desired_ou…
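
One common way to get a loss that is non-zero for only one output is to build the target from the network's own prediction and overwrite just the chosen entry. The sketch below shows the idea in NumPy; the helper name and arguments are hypothetical and not tied to any particular framework.

```python
import numpy as np

def masked_squared_error(predicted, target_value, action_index):
    """Squared error that is non-zero only at `action_index` (illustrative helper)."""
    predicted = np.asarray(predicted, dtype=np.float64)

    # One-hot mask selecting the single output that should be trained.
    mask = np.zeros_like(predicted)
    mask[action_index] = 1.0

    # Copy the prediction everywhere except the selected output,
    # so the error (and therefore the gradient) is zero elsewhere.
    desired = predicted * (1.0 - mask) + target_value * mask
    per_output_loss = (desired - predicted) ** 2
    return per_output_loss, desired

loss, desired = masked_squared_error([1.0, 2.0, 3.0], target_value=5.0, action_index=1)
```

Here only the second output contributes to the loss; the other two entries of `desired` equal the prediction, so their error is exactly zero.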
-
I ran the script last night. It started with ~11 mean reward, and ended with ~15.5 mean reward.
I tried to play this mini-game myself, and I could get ~100 score or more.
DeepMind reached ~100 score…
-
## 🐛 Bug
I am trying to run a small neural network on the CPU and am finding that the memory used by my script increases without limit. Since my script does not do much besides call the network, th…
-
![Screenshot 2024-06-03 112232](https://github.com/GeminiLight/hrl-acra/assets/147413930/ad2dd4b2-b7cb-4dbd-9da5-45782bede298)
After I trained the model and ran several rounds of tests, I found that th…
-
Hello. I have been trying to train an agent in `HumanoidBulletEnv-v0`.
I have tried using multiple frameworks and algorithms, but have not been able to obtain a good policy in this particular environ…
-
I need to get a copy of the `shared` neural network of type `torch::nn::Sequential`. It seems that there is no available API for this purpose at the moment. It seems that declaring and instantiating the n…