-
Hey!
First of all, thank you for this library!
I would like to take your actors and critics and implement an RNN-enhanced TD3 algorithm as described here: https://arxiv.org/pdf/1710.06537.pdf.
I …
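Concretely, I imagine replacing the feed-forward actor trunk with a recurrent one. A minimal sketch in PyTorch of what I have in mind (class and parameter names here are hypothetical, not from this library):

```python
import torch.nn as nn

class RecurrentActor(nn.Module):
    """Sketch of a TD3 actor with an LSTM trunk, so the policy can
    condition on observation histories instead of single frames."""

    def __init__(self, obs_dim, action_dim, max_action, hidden_size=128):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden_size, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(hidden_size, 128),
            nn.ReLU(),
            nn.Linear(128, action_dim),
            nn.Tanh(),
        )
        self.max_action = max_action

    def forward(self, obs_seq, hidden=None):
        # obs_seq: (batch, seq_len, obs_dim); hidden is the recurrent state
        out, hidden = self.lstm(obs_seq, hidden)
        return self.max_action * self.head(out), hidden
```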
-
Hello,
Is there any benefit to having a vanilla REINFORCE algorithm for people trying to learn the concepts? REINFORCE with Baseline includes a value function approximator which has a lot of simila…
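For what it's worth, a minimal sketch of how close the two are (a hypothetical helper, not this repo's API): the only difference is whether a value estimate is subtracted from the return before weighting the log-probabilities.

```python
import torch

def reinforce_loss(log_probs, returns, values=None):
    # Vanilla REINFORCE: weight each log-probability by the raw return.
    # With a baseline: subtract a (detached) value estimate first.
    returns = torch.as_tensor(returns, dtype=torch.float32)
    advantages = returns if values is None else returns - values.detach()
    return -(torch.stack(log_probs) * advantages).sum()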
-
There are several optimizations to our PPO recipe that could help push it closer to SOTA performance. There are also several pieces of documentation we could offer alongside this recipe t…
-
I was trying to reproduce your results; however, whenever I try to run the `simple_speaker_listener` script, it crashes with a shape mismatch (currently trying your latest commit, but also on old co…
-
Here is the code from reinforce.py:
```python
for action, r in zip(self.saved_actions, rewards):
    action.reinforce(r)
```
And here is the code from actor-critic.py:
```python
for (action, value), r in zi…
```
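(Note that `Tensor.reinforce()` has since been removed from PyTorch; a sketch of the equivalent explicit update, assuming a hypothetical `saved_log_probs` list holding the `log_prob` tensors collected during the rollout:)

```python
import torch

def reinforce_update(saved_log_probs, rewards, optimizer):
    # Replaces the removed action.reinforce(r): build the -log_prob * r
    # terms explicitly and backpropagate through the stored log-probs.
    optimizer.zero_grad()
    loss = torch.stack([-lp * r for lp, r in zip(saved_log_probs, rewards)]).sum()
    loss.backward()
    optimizer.step()
```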
-
When I read the paper, they say that it works with discrete action spaces.
Is it also possible with continuous action spaces?
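If the only discrete-specific part is the policy head, one common adaptation (a sketch, not necessarily what the paper does) is to swap the categorical head for a Gaussian one:

```python
import torch
import torch.nn as nn
from torch.distributions import Categorical, Normal

class PolicyHead(nn.Module):
    # The same trunk features feed either a discrete or a continuous head;
    # log_prob() exists on both distributions, so the policy-gradient
    # update itself is unchanged.
    def __init__(self, feat_dim, action_dim, continuous=False):
        super().__init__()
        self.continuous = continuous
        self.out = nn.Linear(feat_dim, action_dim)
        self.log_std = nn.Parameter(torch.zeros(action_dim))

    def forward(self, features):
        if self.continuous:
            return Normal(self.out(features), self.log_std.exp())
        return Categorical(logits=self.out(features))
```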
-
Hi Daniel! Thanks for this excellent repo! I enjoyed reading this paper too!
Here is a small question about the baseline MASAC in your paper.
In the above equation, you do not provide detail …
-
I suppose the SAC algorithm has one actor network and two critic networks. Now I want to rank the importance of the DRL states by calculating integrated gradients of each state in order to sort the states, so I wonder if t…
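A minimal sketch of that computation, assuming a `critic(state, action)` callable that returns Q(s, a) and a zero baseline (all names here are hypothetical):

```python
import torch

def integrated_gradients(critic, state, action, steps=50):
    # Interpolate from a zero baseline to the state, accumulate dQ/ds
    # along the path, then scale by (state - baseline).
    baseline = torch.zeros_like(state)
    grads = torch.zeros_like(state)
    for alpha in torch.linspace(0.0, 1.0, steps):
        point = (baseline + alpha * (state - baseline)).requires_grad_(True)
        q = critic(point, action)
        grads += torch.autograd.grad(q.sum(), point)[0]
    return (state - baseline) * grads / steps
```

The per-dimension magnitudes of the returned attributions could then be used to sort the state features.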
-
Dear Feiyun,
I've been reading your paper,
[Cohesion-based Online Actor-Critic Reinforcement Learning for mHealth Intervention](https://arxiv.org/pdf/1703.10039.pdf),
with much interest. I wo…