Closed AjayTalati closed 7 years ago
Hey, I'm actually currently doing a project using A3C and an NTM. If you want to combine A3C and the DNC, I suggest just using the A3C from pytorch-rl, since these two repos share the same code structure. Basically, replace the `Model` class in pytorch-rl with the `Circuit` class from here and that should already work; of course, you should pay attention to keeping the naming consistent.
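To make that swap concrete, here is a purely illustrative sketch. The real `Model` and `Circuit` classes live in pytorch-rl and pytorch-dnc respectively; the names, signatures, and bodies below are stand-ins, not the actual code:

```python
class Model:
    """Stand-in for pytorch-rl's A3C model class (hypothetical interface)."""
    def __init__(self, input_dims, output_dims):
        self.input_dims = input_dims
        self.output_dims = output_dims

    def forward(self, x):
        return x  # placeholder for the policy/value computation

class Circuit(Model):
    """Stand-in for pytorch-dnc's Circuit. Keeping the constructor and
    forward signatures identical is what 'keep the naming consistent'
    buys you: the training loop never notices the swap."""
    def forward(self, x):
        # external-memory reads/writes would happen here
        return x

def build_agent(model_cls, input_dims, output_dims):
    # The A3C training loop only depends on the shared interface,
    # so either class can be plugged in unchanged.
    return model_cls(input_dims, output_dims)

agent_model = build_agent(Circuit, input_dims=4, output_dims=2)
```

The design point is that the swap is safe only as long as both classes agree on the constructor arguments and the `forward` contract.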
To implement a new `Env`, like one from OpenAI, you can also refer to the inherited `Env`s in pytorch-rl in `./core/env.py`, where there are already classes for OpenAI Gym. If you want your own new `Env`, just write a class inheriting from `Env`, then register it in `factory.py`.
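As a rough sketch of that recipe (the actual base-class methods in `./core/env.py` and the registry shape in `factory.py` may differ; `MyGridWorldEnv` and `EnvDict` are made-up names for illustration):

```python
class Env:
    """Stand-in for pytorch-rl's base Env in ./core/env.py."""
    def reset(self):
        raise NotImplementedError
    def step(self, action):
        raise NotImplementedError

class MyGridWorldEnv(Env):
    """Toy 1-D gridworld: move left/right along a line, reward at the end."""
    def __init__(self, size=5):
        self.size = size
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # action 1 moves right, anything else moves left, clipped to the grid
        self.pos = max(0, min(self.size - 1, self.pos + (1 if action == 1 else -1)))
        reward = 1.0 if self.pos == self.size - 1 else 0.0
        done = reward > 0
        return self.pos, reward, done

# Registration mirrors what factory.py does: map a string key to the class
# so the rest of the code can build envs by name.
EnvDict = {"my-gridworld": MyGridWorldEnv}
env = EnvDict["my-gridworld"](size=5)
```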
If there are further questions, please let me know! I'm also curious to see those two work together :)
Hey @jingweiz, thanks very much for your helpful reply. I'm going to try what you suggest this weekend.
On another note, I'm working on the intrinsic curiosity paper with Kim, using his repo, https://github.com/kimhc6028/pytorch-noreward-rl/blob/master/main.py. We haven't got any good results yet though, as Doom, Atari and Mario all take rather a long time to train with A3C. The intrinsic curiosity module is fairly stand-alone though, and I think it can be added to other agents, in particular your Q-learners?
I was wondering if you were interested in adding this to your pytorch-rl code base?
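For context, the core of the curiosity module fits in a few lines: a forward model predicts the feature vector of the next state, and the prediction error is paid out as an intrinsic reward. A dependency-free sketch of just that reward term (the real ICM also learns the feature encoder via an inverse model; `eta` and the vectors here are illustrative):

```python
def intrinsic_reward(phi_next, phi_next_pred, eta=0.01):
    """Curiosity bonus: eta/2 * ||phi(s_{t+1}) - phi_hat(s_{t+1})||^2,
    i.e. the forward model's squared prediction error in feature space."""
    err = sum((a - b) ** 2 for a, b in zip(phi_next, phi_next_pred))
    return 0.5 * eta * err

# A well-predicted (familiar) transition yields no curiosity bonus...
r_familiar = intrinsic_reward([1.0, 0.0], [1.0, 0.0])
# ...while a surprising one yields a positive bonus.
r_novel = intrinsic_reward([1.0, 0.0], [0.0, 1.0])
```

Because the bonus only depends on features of consecutive states, it can in principle be bolted onto any agent's reward signal, which is why it seems transferable to the Q-learners.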
At the moment, the simplest environments to test the curiosity module on seem to be large gridworlds. In particular, I've ported over the gridworld from this TensorFlow implementation, and am experimenting on it with the curiosity module and a base A3C agent. It doesn't seem to be successful though?
So I was wondering how to implement this environment in pytorch-rl? I guess it requires a new 'gridworld' class in env.py? It shares a similar API to gym environments, so I guess modifying `class GymEnv(Env)` would be a good starting point? Arthur Juliani, the author of the gridworld code, also has some nice bandit environments in another of his repos, Meta-RL; they're nice as they train quite fast.
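Since the ported gridworld already speaks a gym-like API, the `GymEnv(Env)`-style route could look roughly like this adapter (a sketch only: `GridWorld` stands in for Juliani's environment, and the real `GymEnv` in `core/env.py` does more than forward calls):

```python
class GridWorld:
    """Stand-in for the ported gridworld, exposing a gym-like interface:
    reset() -> obs, step(a) -> (obs, reward, done, info)."""
    def __init__(self, n=3):
        self.n = n
        self.agent = (0, 0)

    def reset(self):
        self.agent = (0, 0)
        return self.agent

    def step(self, action):
        # toy dynamics for illustration: 0 moves right, otherwise down
        r, c = self.agent
        if action == 0:
            c = min(self.n - 1, c + 1)
        else:
            r = min(self.n - 1, r + 1)
        self.agent = (r, c)
        done = self.agent == (self.n - 1, self.n - 1)
        return self.agent, (1.0 if done else 0.0), done, {}

class GridWorldEnv:
    """Adapter in the spirit of pytorch-rl's GymEnv(Env): it only forwards
    the gym-style calls, so any gym-compatible env slots in unchanged."""
    def __init__(self, env):
        self.env = env

    def reset(self):
        return self.env.reset()

    def step(self, action):
        return self.env.step(action)
```

If the gridworld really matches the gym calling convention, the adapter stays this thin; any mismatch (e.g. a different `step` return tuple) would be patched up inside the adapter rather than in the agent.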
Thanks a lot for your advice, and thanks once again for implementing these great code bases :+1:
Hey, thanks a lot for the comments! That paper is on my list, but I'm currently quite busy, so I don't think I'll have time in the next several weeks. I've trained on one of my own grid-world simulators and it works for me; maybe there's something wrong with how you set the reward?
Hi Jingweiz,

this is a great implementation of the DNC, thanks a lot for sharing it. I was wondering what's the recommended/simplest way to add new environments (from OpenAI)? I guess it would just mean adding them to `./utils/factory.py`?

There's an interesting paper I've been reading, Bridging the Gap Between Value and Policy Based Reinforcement Learning, which compares some A3C-like algorithms on the six OpenAI algorithmic tasks. It seems like training the DNC with A3C on these tasks would be a nice test - I'm interested in both the extra benefit of external memory, and also in different ways of training the DNC?
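For a rough sense of what those algorithmic tasks ask of the agent, here is a toy, pure-Python version of the Copy task (not Gym's actual tape-based API; the episode shape and scoring are simplified for illustration):

```python
import random

def make_copy_episode(length=5, n_symbols=5, seed=0):
    """Generate one Copy-task episode: an input tape of random symbols,
    where the correct output is the tape itself."""
    rng = random.Random(seed)
    tape = [rng.randrange(n_symbols) for _ in range(length)]
    target = list(tape)
    return tape, target

def score(predicted, target):
    # per-symbol reward, as in the Gym algorithmic tasks: +1 per correct symbol
    return sum(1.0 for p, t in zip(predicted, target) if p == t)

tape, target = make_copy_episode()
assert score(tape, target) == len(tape)  # copying the tape scores full reward
```

Tasks of this shape are exactly where external memory should pay off, since the agent has to store the tape contents and reproduce them in order.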