Element-Research / rnn

Recurrent Neural Network library for Torch7's nn

RAM using CPU more than GPU? #361

Open Lzc6996 opened 7 years ago

Lzc6996 commented 7 years ago

When I run the RAM example, I find that my GPU memory is almost free but all of my CPU cores are running at 99%. Are there layers in RAM that must run on the CPU, so that the model ends up a little slow? My GPU is a Tesla K40m, and the speed is about 140 examples/s with the default settings on MNIST. I want to know whether something is wrong.
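A quick way to check whether the network's parameters actually ended up on the GPU is to inspect their tensor type. A minimal sketch, where `model` is a stand-in and not the actual RAM network from the example:

```lua
-- Minimal sketch: check whether a network's parameters live on the GPU.
-- The `model` here is a stand-in, not the RAM network from the example.
require 'nn'
require 'cunn'   -- CUDA backends for nn modules (pulls in cutorch)

local model = nn.Sequential():add(nn.Linear(10, 10))
model:cuda()                      -- move parameters and buffers to GPU memory

local params = model:getParameters()
print(torch.type(params))         -- 'torch.CudaTensor' means the weights are on the GPU
```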

JoostvDoorn commented 7 years ago

Please post the exact piece of code that you are running here, and be as specific as possible with your question. Make sure you call :cuda() on the network to run it on GPU.
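For reference, a minimal sketch of what running an nn model on the GPU looks like; the layers and sizes below are illustrative and not taken from the RAM example:

```lua
-- Minimal sketch of running an nn model on the GPU.
-- The layers and sizes are illustrative, not the RAM architecture.
require 'nn'
require 'cutorch'
require 'cunn'

local model = nn.Sequential()
model:add(nn.Linear(256, 128))
model:add(nn.ReLU())
model:add(nn.Linear(128, 10))

model:cuda()                               -- move the network to the GPU

local input = torch.rand(32, 256):cuda()   -- inputs must be CudaTensors as well
local output = model:forward(input)
print(torch.type(output))                  -- 'torch.CudaTensor'
```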

Lzc6996 commented 7 years ago

@JoostvDoorn I just ran the example code with `th examples/recurrent-visual-attention.lua --cuda`.

Lzc6996 commented 7 years ago

@JoostvDoorn Sorry, I made a mistake in my earlier report, but I have found another problem. When I run recurrent-visual-attention.lua with `th examples/recurrent-visual-attention.lua --cuda`, my GPUs are only about 20% utilized. nvidia-smi shows:

```
| 0  Tesla K40m      On  | 0000:03:00.0     Off |                    0 |
| N/A  42C  P0  65W / 235W |   681MiB / 11519MiB |     23%      Default |
+-------------------------------+----------------------+----------------------+
| 1  Tesla K40m      On  | 0000:04:00.0     Off |                    0 |
| N/A  43C  P0  64W / 235W |   734MiB / 11519MiB |     20%      Default |
+-------------------------------+----------------------+----------------------+
| 2  Tesla K40m      On  | 0000:83:00.0     Off |                    0 |
| N/A  43C  P0  66W / 235W |   678MiB / 11519MiB |     23%      Default |
```

I want to know which part of this code is the speed bottleneck, and how I can fix it to make the program faster.
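One way to narrow down the bottleneck is to time the forward and backward passes separately, calling cutorch.synchronize() before reading the timer so pending GPU kernels are accounted for. A rough sketch: the tiny model, criterion, input, and target below are stand-ins so it runs on its own; in practice you would wrap the example's own training step instead.

```lua
-- Rough timing sketch to locate the bottleneck: time forward and backward
-- separately, synchronizing the GPU before reading the timer.
-- The tiny model, criterion, input and target are placeholders, not the
-- objects built by recurrent-visual-attention.lua.
require 'nn'
require 'cunn'

local model     = nn.Sequential():add(nn.Linear(256, 10)):cuda()
local criterion = nn.CrossEntropyCriterion():cuda()
local input     = torch.rand(32, 256):cuda()
local target    = torch.Tensor(32):random(1, 10):cuda()

local timer = torch.Timer()

cutorch.synchronize(); timer:reset()
local output = model:forward(input)
cutorch.synchronize()
print(string.format('forward : %.3f s', timer:time().real))

cutorch.synchronize(); timer:reset()
criterion:forward(output, target)
local gradOutput = criterion:backward(output, target)
model:backward(input, gradOutput)
cutorch.synchronize()
print(string.format('backward: %.3f s', timer:time().real))
```

If both passes are fast but overall throughput is still low, the time is likely going into CPU-side work between batches (data loading, glimpse extraction, small tensor ops), which would match the high CPU usage and low GPU utilization reported above.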