-
https://arxiv.org/abs/1802.06070
# Abstract
- Learn **skills** by maximizing an information-theoretic objective with a maximum entropy policy
- Train a typical reinforcement learning task with the best **skill** after unsupervised…
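For quick reference, the paper's pseudo-reward log q(z|s) − log p(z) can be sketched as below. The linear discriminator and its weights are toy stand-ins for illustration, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

n_skills = 4  # skills z drawn from a uniform prior p(z)
log_p_z = np.log(1.0 / n_skills)

W = rng.normal(size=(3, n_skills))  # toy discriminator weights; normally learned

def discriminator_log_prob(state, skill):
    """Hypothetical discriminator q(z|s): a linear softmax classifier."""
    logits = state @ W
    logits -= logits.max()  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return log_probs[skill]

def diayn_reward(state, skill):
    # DIAYN pseudo-reward: log q(z|s) - log p(z).
    return discriminator_log_prob(state, skill) - log_p_z

s = rng.normal(size=3)
r = diayn_reward(s, skill=2)
```

The policy maximizing this reward is pushed toward states the discriminator can classify by skill, which is how skill diversity emerges.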
-
Hi there,
Thanks for sharing your repo; it's been a great help in exploring the field. I have a question I'm not sure of the answer to. In this implementation I believe you have implemented the V(s) funct…
-
Hi there! I'm trying to reproduce your code and found a small issue in the offline training setup. Hope it's helpful.
PYTHONPATH=./ python3 ./_sim_script_example_/ka.py instead of PYTHONPATH=./ pytho…
-
I have looked at several people's work on adding RNNs to reinforcement learning algorithms, but strangely, almost every implementation is different. So I would like to ask how you integrate L…
-
Hello,
I had a quick question about the form of the value function. Right now by default it is an action value function with a linear layer that receives the output of the decoder. I was wondering …
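For concreteness, the difference between the two head shapes being discussed can be sketched in plain NumPy. The feature dimension and weights here are hypothetical, not the repo's actual layers:

```python
import numpy as np

rng = np.random.default_rng(0)
feat_dim, n_actions = 8, 4
features = rng.normal(size=feat_dim)   # stand-in for the decoder output

# Action-value head: one linear output per discrete action, Q(s, .).
W_q = rng.normal(size=(feat_dim, n_actions))
q_values = features @ W_q              # shape (n_actions,)

# State-value head: a single linear output, V(s).
w_v = rng.normal(size=feat_dim)
v_value = features @ w_v               # scalar

# Under a greedy policy V(s) = max_a Q(s, a); a dedicated V head
# estimates that quantity directly instead of going through Q.
```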
-
Hi! I'm trying to implement DDPG as well, based on the paper [Continuous control with deep reinforcement learning](http://arxiv.org/pdf/1509.02971.pdf). Though without much success yet... So I was looking …
-
Hey VinF,
thanks for your work!
I have questions about the DDPG implementation in deer.
Patrick Emami recommends in http://pemami4911.github.io/blog/2016/08/21/ddpg-rl.html to use for the act…
-
In the PPO algorithm described here [https://arxiv.org/pdf/1707.06347.pdf],
experience is collected as (state, action, reward) tuples, [s1,a1,r1,s2,a2,r2,...,sn,an,rn], to train the
actor model and critic mode…
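The [s1,a1,r1,...,sn,an,rn] rollout format above can be sketched as follows. This is a minimal Monte-Carlo return computation for one trajectory; PPO's actual generalized advantage estimation and clipped surrogate objective are omitted:

```python
import numpy as np

def discounted_returns(rewards, gamma=0.99):
    """Compute returns G_t = r_t + gamma * G_{t+1} backwards over one trajectory."""
    returns = np.zeros(len(rewards))
    g = 0.0
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * g
        returns[t] = g
    return returns

# A flat rollout [s1,a1,r1, ..., sn,an,rn] regrouped into parallel arrays.
rollout = [(0.1, 0, 1.0), (0.2, 1, 0.0), (0.3, 0, 2.0)]
states, actions, rewards = map(np.array, zip(*rollout))
returns = discounted_returns(rewards)

# The critic regresses V(s_t) toward `returns`; the actor is updated
# with advantages A_t = G_t - V(s_t).
```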
-
### Issue Checklist
- [X] I am using NexT version 8.0 or later.
- [X] I have already read the [Troubleshooting page of Hexo](https://hexo.io/docs/troubleshooting) and [Troubleshooting page of NexT…
-
## Motivation
### 1. Consistent style for `torch.nn.modules.loss.*Loss`
In `torch.nn.modules.loss`, there are many `*Loss` classes subclassing `nn.Module`. The `Loss.__init__()` does not take other `nn…