Closed johnyboyoh closed 9 years ago
Right, sorry, I am such a clumsy beginner. As you can see, the code is still torch.Tensor when it should be a CUDA tensor, since we want this to run on the GPU. I had used a dirty fix for that, i.e. I added the following in optim/fista.lua, at line 55:
if (xinit:type() == 'torch.CudaTensor') then
   params.xkm = params.xkm:cuda()
   params.y = params.y:cuda()
   params.ply = params.ply:cuda()
end
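A slightly more general variant of the same patch (just a sketch, not tested inside optim) converts the buffers to whatever type xinit has, so the same code path also works on the CPU:

```lua
-- Hypothetical generalization of the fix above: instead of
-- hard-coding torch.CudaTensor, match the FISTA buffers to
-- the type of the initial iterate.
params.xkm = params.xkm:type(xinit:type())
params.y   = params.y:type(xinit:type())
params.ply = params.ply:type(xinit:type())
```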
I had forgotten about this. I will try to think of a better way to do this that doesn't require changing other packages, and update my code. Thanks for reporting this. Also, in your test code, you need to move the model into GPU memory:
module:cuda()
and convert all tensors to CUDA tensors, i.e. the inputs and targets.
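Putting the two steps together, a minimal sketch (module, inputs and targets are assumed to already exist in your training script):

```lua
require 'cutorch'
require 'cunn'

-- Move the model's parameters and buffers to GPU memory.
module:cuda()

-- Convert the data to CUDA tensors as well, so types match.
inputs  = inputs:cuda()
targets = targets:cuda()
```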
@viorik Hi, thank you for this reply. Unfortunately, the command module:cuda() yields the following error:
/home/torch/install/share/lua/5.1/torch/Tensor.lua:241: attempt to index a nil value
stack traceback:
  /home/torch/install/share/lua/5.1/torch/Tensor.lua: in function 'type'
  /home/torch/install/share/lua/5.1/nn/utils.lua:52: in function 'recursiveType'
  /home/torch/install/share/lua/5.1/nn/Module.lua:123: in function 'type'
  /home/torch/install/share/lua/5.1/nn/utils.lua:45: in function 'recursiveType'
  /home/torch/install/share/lua/5.1/nn/Module.lua:123: in function 'cuda'
  train-autoencoder-mnist.lua:224: in main chunk
I must be missing something... Maybe you can post a simple test script that works for you?
I uploaded a demo script, demo_conv_psd_gpu. Let me know what the error is now. Try both ways of moving the module to the GPU, i.e. the commented-out one as well, where you move each part individually.
@viorik your example demo_conv_psd_gpu is working well both ways (after commenting out line 67). I will now use it as a reference to dig further. Thank you very much for being so helpful! :+1:
Cool, glad to hear that. Small tip: when you use mini-batch training, keep the batch size small (e.g. 5); you will get nice-looking filters faster.
Hi, I am trying to run your demo as well. However, with the th interpreter I get an error that 'qt' was not found; the error happens when using functions from the image module.
And when I use qlua instead, I get another error in the call dataset = getdata(filename, params.inputsize) on line 87.
How are you executing the file/which interpreter are you using?
@Richi91, use qlua to be able to display things (image samples and learnt filters); with th that won't be possible. To load the data, you need an extra file that wasn't in my repo since it is the same as in the original unsup package, but I've uploaded it now too. Check autoencoder-data.lua
qt does not work in th. On line 49 set '-display' to false and there should be no problem running on th. BTW, I started using ZeroBrane, which is an IDE for Lua; it is working quite well IMO.
Thanks for your answers! I have that other file too. The error occurs inside the function getdata(), when calling readObject (line 5): "read error: read 1 blocks instead of 15680000"
I guess I will figure out what caused the problem; I just wanted to know if it works when you simply execute the script like "qlua demo_conv_psd_gpu.lua" in the shell. I tried with both Eclipse LDT (qlua interpreter) and qlua directly in the shell.
I suspect this may be related to having the same file open in or used by several shells. Try using os.execute('clear') prior to loading the file.
@Richi91, I've now updated autoencoder-data.lua. Since I imagine you don't have the dataset, you'll first need to wget it.
@Richi91 also make sure lines 58-63 are not commented out.
I gave it a go now using an iTorch notebook. It works OK with qt as well, so you may want to consider this relatively convenient environment too. I had to add arg = "" on line 52 to avoid an error on line 53 when using dofile("demo_conv_psd_gpu.lua")
The following 3 lines produce the error below:
filename = '/home/richi-ubuntu/workspace/test/src/tr-berkeley-N5K-M56x56-lcn.ascii'
file = torch.DiskFile(filename,'r')
data=file:readObject()
qlua: /usr/local/share/lua/5.1/torch/File.lua:270: read error: read 1 blocks instead of 15680000 at /tmp/luarocks_torch-scm-1-1427/torch7/lib/TH/THDiskFile.c:310
The same file in .bin format, obtained from http://cs.nyu.edu/~koray/publis/code/tr-berkeley-N5K-M56x56-lcn.bin, produces the same error (also with iTorch). I tried another file, e.g. "http://torch7.s3-website-us-east-1.amazonaws.com/data/housenumbers/train_32x32.t7", and it works...
It's really strange that I can read all files with th but not with qlua.
Don't bother with this anymore; this bug is not related to your unsupgpu module. Anyway, thanks for your help!
I was able to solve my issue by using
local data = torch.DiskFile(datafile,'r'):binary():readObject()
instead of
local data = torch.DiskFile(datafile,'r'):readObject()
@Richi91 good to know. Thanks.
@viorik Hi, I am now getting the following error:
/home/torch/install/share/lua/5.1/unsupgpu/FistaL1.lua:79: attempt to call method 'shrinkagegpu' (a nil value)
stack traceback:
  /home/torch/install/share/lua/5.1/unsupgpu/FistaL1.lua: in function 'pl'
  /home/torch/install/share/lua/5.1/optim/fista.lua:95: in function 'FistaLS'
  /home/torch/install/share/lua/5.1/unsupgpu/FistaL1.lua:119: in function 'updateOutput'
  /home/torch/install/share/lua/5.1/unsupgpu/psd.lua:52: in function 'updateOutput'
  train-autoencoder-mnist.lua:274: in main chunk
It appears that this CUDA method is not recognized. BTW, the arguments to code:shrinkagegpu(self.lambda/L) in this case are: code is a torch.DoubleTensor of size 16x32x32, self.lambda is 1, and L is 0.1.
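For what it's worth, shrinkagegpu presumably only exists as a method on CUDA tensors, which would explain the nil value when code is still a torch.DoubleTensor. As a point of reference, L1 shrinkage (soft-thresholding) on a plain tensor can be sketched with standard torch operations; this is an untested illustration, not the package's own implementation:

```lua
-- Soft-thresholding sketch: sign(x) * max(|x| - t, 0),
-- where 'code' is a torch.DoubleTensor and 't' the threshold (lambda/L).
local function shrinkage(code, t)
   local s = torch.sign(code)          -- remember the signs
   code:abs():add(-t)                  -- |x| - t
   code[torch.lt(code, 0)] = 0         -- clamp negatives to zero
   code:cmul(s)                        -- restore signs
   return code
end
```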