jingweiz / pytorch-dnc

Neural Turing Machine (NTM) & Differentiable Neural Computer (DNC) with pytorch & visdom

Simplest recommended way to add new envs #2

Closed · AjayTalati closed 7 years ago

AjayTalati commented 7 years ago

Hi Jingweiz,

this is a great implementation of the DNC, thanks a lot for sharing it. I was wondering what's the recommended/simplest way to add new environments (e.g. from OpenAI)? I guess it would just mean adding them to ./utils/factory.py?

There's an interesting paper I've been reading, Bridging the Gap Between Value and Policy Based Reinforcement Learning, which compares some A3C like algorithms on the six OpenAI algorithmic tasks.

It seems like it would be a nice test for the DNC to train it with A3C on these tasks. I'm interested both in the extra benefit of external memory and in different ways of training the DNC.

jingweiz commented 7 years ago

Hey, I'm actually doing a project using A3C and NTM at the moment. If you want to combine A3C and DNC, I suggest using the A3C from pytorch-rl, since the two repos share the same code structure. Basically, just replace the Model class in pytorch-rl with the Circuit class here and that should already work; of course, you should take care to keep the naming consistent. To implement a new Env, e.g. from OpenAI, you can also refer to the inherited Envs in pytorch-rl's ./core/env.py, where there are already classes for the OpenAI environments. If you want your own new Env, just write a class inheriting from Env and register it in factory.py, as in the sketch below. If you have further questions, please let me know! Also curious to see those two work together :)
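In rough code, the new-Env part would look something like this (the constructor signature, method names and the factory dict are my assumptions here, so double-check against the actual code):

```python
# A rough sketch (untested) -- the constructor signature and method names
# are assumptions; check the actual Env base class in pytorch-rl's
# ./core/env.py before copying this.
from core.env import Env  # assuming this is where the base class lives

class MyNewEnv(Env):  # hypothetical environment
    def __init__(self, args, env_ind=0):
        super(MyNewEnv, self).__init__(args, env_ind)
        # build your simulator (or wrap a gym env) here

    def reset(self):
        # return the initial observation
        raise NotImplementedError

    def step(self, action):
        # apply the action; return next observation, reward, terminal flag
        raise NotImplementedError

# then register it in utils/factory.py, e.g. (dict name assumed):
# EnvDict = {"gym":    GymEnv,
#            "my-new": MyNewEnv}
```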

AjayTalati commented 7 years ago

Hey @jingweiz, thanks very much for your helpful reply. I'm going to try what you suggest this weekend.

On another note, I'm working on the intrinsic curiosity paper with kim, using his repo, https://github.com/kimhc6028/pytorch-noreward-rl/blob/master/main.py. We haven't got any good results yet, though, as Doom, Atari and Mario all take rather a long time to train with A3C. The intrinsic curiosity module is fairly stand-alone, however, and I think it could be added to other agents, in particular your Q-learners.
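The core idea is small enough to drop into another agent. Roughly (this is my paraphrase of the paper, not kim's actual code, and the network sizes are made up):

```python
import torch
import torch.nn as nn

# Sketch of the intrinsic-curiosity idea (Pathak et al.): a forward model
# predicts the next state feature from the current feature and action,
# and its prediction error becomes an intrinsic reward.
class ForwardModel(nn.Module):
    def __init__(self, feat_dim=32, num_actions=4):  # sizes are arbitrary
        super(ForwardModel, self).__init__()
        self.num_actions = num_actions
        self.net = nn.Sequential(
            nn.Linear(feat_dim + num_actions, 64),
            nn.ReLU(),
            nn.Linear(64, feat_dim))

    def forward(self, feat, action):
        # one-hot encode the discrete action and predict the next feature
        onehot = torch.zeros(feat.size(0), self.num_actions)
        onehot.scatter_(1, action.unsqueeze(1), 1.0)
        return self.net(torch.cat([feat, onehot], dim=1))

def intrinsic_reward(model, feat, action, next_feat, eta=0.01):
    # intrinsic reward = scaled prediction error, added to the env reward
    pred = model(feat, action)
    return eta * 0.5 * (pred - next_feat).pow(2).sum(dim=1)
```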

I was wondering if you were interested in adding this to your pytorch-rl code base?

At the moment, the simplest environments to test the curiosity module on seem to be large gridworlds. In particular, I've ported over the gridworld from this TensorFlow implementation, and I'm experimenting on it with the curiosity module and a base A3C agent. It doesn't seem to be successful so far, though.

So I was wondering how to implement this environment in pytorch-rl? I guess it requires a new 'gridworld' class in env.py? It shares a similar API with gym environments, so modifying class GymEnv(Env) seems like a good starting point; something like the sketch below. Arthur Juliani, the author of the GridWorld code, also has some nice bandit environments in another of his repos, Meta-RL; they're nice as they train quite fast.
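Roughly what I had in mind (untested; I'm assuming the Env interface follows the reset/step pattern of GymEnv in pytorch-rl's core/env.py, and that Juliani's gridworld is importable as below):

```python
from core.env import Env      # assumed base class location
from gridworld import gameEnv  # Juliani's gridworld, assumed module name

class GridWorldEnv(Env):
    def __init__(self, args, env_ind=0):
        super(GridWorldEnv, self).__init__(args, env_ind)
        self.env = gameEnv(partial=False, size=5)  # e.g. a 5x5 grid

    def reset(self):
        # returns the rendered grid as the initial observation
        return self.env.reset()

    def step(self, action):
        # gameEnv.step returns (state, reward, done), much like gym
        state, reward, done = self.env.step(action)
        return state, reward, done
```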

Thanks a lot for your advice, and thanks once again for implementing these great code bases :+1:

jingweiz commented 7 years ago

Hey, thanks a lot for the comments! That paper is on my list, but I'm quite busy at the moment, so I don't think I'll have time in the next several weeks. I've trained on one of my own gridworld simulators and it works for me; maybe there's something wrong with how you set the reward?
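For example, something along these lines is what I mean by the reward setup (values are just illustrative, not what my simulator uses):

```python
# Illustrative only -- values are arbitrary. With A3C, a reward that is
# too sparse or badly scaled can easily stall learning on a large grid.
def grid_reward(agent_pos, goal_pos, hit_wall):
    if agent_pos == goal_pos:
        return 1.0    # reaching the goal
    if hit_wall:
        return -0.1   # small penalty for bumping into a wall
    return -0.01      # small step cost to encourage shorter paths
```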