How to use GPU? - Githubissues

The script run_cpu works well for learning the game policy. Since a GPU should normally be better for learning, I changed the gpu option in the file 'run_cpu' to gpu = 0, but this option does not seem to work. Is there a way to use the GPU for the learning process, please?

Below is what I have tried.

Setting gpu=0 only produced the following error.

... state dim multiplier 1
/Users/yndk/torch/install/bin/luajit: /Users/yndk/torch/install/share/lua/5.1/torch/Tensor.lua:238: attempt to index a nil value
stack traceback:
    /Users/yndk/torch/install/share/lua/5.1/torch/Tensor.lua:238: in function 'type'
    /Users/yndk/torch/install/share/lua/5.1/nn/utils.lua:52: in function 'recursiveType'
    /Users/yndk/torch/install/share/lua/5.1/nn/Module.lua:126: in function 'type'
    /Users/yndk/torch/install/share/lua/5.1/nn/utils.lua:45: in function 'recursiveType'
    /Users/yndk/torch/install/share/lua/5.1/nn/utils.lua:41: in function 'recursiveType'
    /Users/yndk/torch/install/share/lua/5.1/nn/Module.lua:126: in function 'type'
    /Users/yndk/torch/install/share/lua/5.1/nn/utils.lua:45: in function 'recursiveType'
    /Users/yndk/torch/install/share/lua/5.1/nn/utils.lua:41: in function 'recursiveType'
    /Users/yndk/torch/install/share/lua/5.1/nn/Module.lua:126: in function 'cuda'
    ./lstm_embedding.lua:149: in function 'network'
    ./NeuralQLearner.lua:90: in function '__init'
    /Users/yndk/torch/install/share/lua/5.1/torch/init.lua:91: in function </Users/yndk/torch/install/share/lua/5.1/torch/init.lua:87>
    [C]: at 0x0a6b3250
    agent.lua:158: in main chunk
    [C]: in function 'dofile'
    ...yndk/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x010a3a9bd0

So I edited the file lstm_embedding.lua to import cunn: require 'cunn'

After that, $ run_cpu 1 got through the first learning loop. But as soon as the second loop started, it produced these error messages:

Network weight sum: 131.26902770996
Saved: logs/run1/DQN.t7
/Users/yndk/torch/install/bin/luajit: ./NeuralQLearner.lua:255: attempt to call method 'cuda' (a nil value)p: 17ms
stack traceback:
    ./NeuralQLearner.lua:255: in function 'getQUpdate'
    ./NeuralQLearner.lua:268: in function 'qLearnMinibatch'
    ./NeuralQLearner.lua:412: in function 'perceive'
    agent.lua:200: in main chunk
    [C]: in function 'dofile'
    ...yndk/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x010f36ebd0

Next, I also edited the file NeuralQLearner.lua to import cunn (require 'cunn'), but the same error message appeared. I don't know what to do now.

We haven't added functional GPU support to the final version of the code, but I remember that in initial experiments, the GPU-based code didn't give any significant speed-ups since the main bottleneck lies in the recurrent (LSTM) modules. However, this might not be the case anymore with recent improvements to the libraries.

The error you report occurs because the call targets:cuda() is invalid: targets is a Lua table, not a torch tensor, so it has no cuda method. You can fix this by replacing line 255 in NeuralQLearner.lua with: if self.gpu >= 0 then targets = {targets[1]:cuda(), targets[2]:cuda()} end

There might be other places that require similar changes. If you end up fixing them, please consider submitting a pull request!
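For the other places that may need the same kind of change, one option is a small helper that converts either a single tensor or a table of tensors. This is only a sketch: toCuda is a hypothetical name that does not exist in the repository, and it assumes that anything exposing a cuda method should be converted by calling it (as Torch tensors do once cunn is loaded). It is similar in spirit to the recursiveType function that appears in the first stack trace.

```lua
-- Hypothetical helper, not part of the repository.
-- Tables without a cuda method are treated as plain containers and
-- converted entry by entry; everything else is assumed to be tensor-like.
local function toCuda(x)
  if type(x) == 'table' and x.cuda == nil then
    -- Plain container, e.g. targets = {tensor1, tensor2}.
    local out = {}
    for key, value in pairs(x) do
      out[key] = toCuda(value)
    end
    return out
  end
  -- Assumed tensor-like; this fails loudly if x has no cuda method.
  return x:cuda()
end

-- Possible use at the failing line (assumes self.gpu as in the fix above):
--   if self.gpu >= 0 then targets = toCuda(targets) end
```

The helper returns a new table rather than modifying its argument, so the original CPU values stay untouched if they are still needed elsewhere.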

karthikncode / text-world-player

How to use GPU? #4
