karpathy / neuraltalk2

Efficient Image Captioning code in Torch, runs on GPU

How can I get convert_checkpoint_gpu_to_cpu.lua to run on OSX ? #177

Closed by orgicus 7 years ago

orgicus commented 7 years ago

Hi,

I'm trying to use the convert_checkpoint_gpu_to_cpu.lua script, but I keep running into all sorts of errors, and it's unclear what the best setup is (e.g. torch version, luarocks package versions, etc.).

I started off with this:

LD_LIBRARY_PATH=/usr/local/cuda/lib/ luajit convert_checkpoint_gpu_to_cpu.lua ../StackGAN/models/text_encoder/lm_sje_nc4_cub_hybrid_gru18_a1_c512_0.00070_1_10_trainvalids.txt_iter30000_cpu.t7 
luajit: /Users/George/torch-cl/install/share/lua/5.1/cudnn/init.lua:98: attempt to call field 'hasFastHalfInstructions' (a nil value)
stack traceback:
    /Users/George/torch-cl/install/share/lua/5.1/cudnn/init.lua:98: in function 'fasterHalfMathTypeForCurrentDevice'
    /Users/George/torch-cl/install/share/lua/5.1/cudnn/init.lua:112: in function 'configureMath'
    /Users/George/torch-cl/install/share/lua/5.1/cudnn/init.lua:131: in main chunk
    [C]: in function 'require'
    convert_checkpoint_gpu_to_cpu.lua:16: in main chunk
    [C]: at 0x010f3d1ac0

Then, after a long struggle installing torch (not torch-cl) from scratch, I started getting cudnn errors:

/Users/George/torch/install/share/lua/5.1/trepl/init.lua:389: /Users/George/torch/install/share/lua/5.1/trepl/init.lua:389: /Users/George/torch/install/share/lua/5.1/cudnn/ffi.lua:1278: 'libcudnn (R4) not found in library path.
Please install CuDNN from https://developer.nvidia.com/cuDNN
Then make sure files named as libcudnn.so.4 or libcudnn.4.dylib are placed in your library load path (for example /usr/local/lib , or manually add a path to LD_LIBRARY_PATH)

stack traceback:
    [C]: in function 'error'
    /Users/George/torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
    [string "_RESULT={require('cudnn')}"]:1: in main chunk
    [C]: in function 'xpcall'
    /Users/George/torch/install/share/lua/5.1/trepl/init.lua:661: in function 'repl'
    ...orge/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:204: in main chunk
    [C]: at 0x0102b75ad0    
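The message above asks for libcudnn.4.dylib (or libcudnn.so.4) to be on the library load path. One way to get there is to symlink the cuDNN files into a directory the loader already searches. The following is only a rough sketch: the function name `link_cudnn` is hypothetical, and the directories in the example are assumptions taken from the paths mentioned in this thread, so adjust them to wherever cuDNN was actually unpacked.

```shell
#!/bin/sh
# Symlink every libcudnn* file found in SRC into DEST.
# Pass absolute paths, otherwise the links will not resolve from DEST.
# Returns 0 if at least one file was linked, 1 if nothing matched.
link_cudnn() {
    src=$1
    dest=$2
    found=1
    for lib in "$src"/libcudnn*; do
        # Skip the unexpanded glob pattern when no file matches.
        [ -e "$lib" ] || continue
        ln -sf "$lib" "$dest/$(basename "$lib")"
        found=0
    done
    return $found
}

# Example (assumed paths, may need sudo):
#   link_cudnn /usr/local/cuda/lib /usr/local/lib
```

The function only creates links; it does not check that the cuDNN version matches what the cudnn Lua bindings expect (R4 in the error above).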

Eventually I bit the bullet, downgraded libcudnn from 5.0 to 4.0, made symbolic links from /usr/local/cuda/ to /usr/local, and got the cudnn module to work...

but I'm getting a different error now:

luajit convert_checkpoint_gpu_to_cpu.lua ../StackGAN/models/text_encoder/lm_sje_nc4_cub_hybrid_gru18_a1_c512_0.00070_1_10_trainvalids.txt_iter30000_cpu.t7
luajit: cuda runtime error (35) : CUDA driver version is insufficient for CUDA runtime version at /tmp/luarocks_cutorch-scm-1-8387/cutorch/lib/THC/THCGeneral.c:16
stack traceback:
    [C]: at 0x07a6f580
    [C]: in function 'require'
    ...s/George/torch-cl/install/share/lua/5.1/cutorch/init.lua:2: in main chunk
    [C]: in function 'require'
    convert_checkpoint_gpu_to_cpu.lua:14: in main chunk
    [C]: at 0x01071d8ac0

Has anyone run into similar issues? Does this refer to the CUDA Toolkit (not cudnn)? I'm using CUDA Toolkit 7.5 (on OSX 10.11.5), by the way. Should I go for 8.0? It's unclear.

Any hints or tips on how I could get the GPU-to-CPU converter working?
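For context on the second error: CUDA runtime error 35 (cudaErrorInsufficientDriver) is raised when the installed NVIDIA driver supports an older CUDA version than the runtime library the program was built against, so it concerns the driver and toolkit pairing rather than cuDNN. A minimal sketch of that comparison follows. It does not query CUDA itself; the version numbers are passed in by hand, and the helper names `version_ge` and `check_cuda_versions` are hypothetical.

```shell
#!/bin/sh
# Return 0 if dotted numeric version $1 >= $2 (e.g. 8.0 vs 7.5).
version_ge() {
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        # Leading field of each version; a missing field counts as 0.
        fa=${a%%.*}
        fb=${b%%.*}
        [ -n "$fa" ] || fa=0
        [ -n "$fb" ] || fb=0
        [ "$fa" -gt "$fb" ] && return 0
        [ "$fa" -lt "$fb" ] && return 1
        # Drop the field that was just compared.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 0
}

# $1: CUDA version supported by the driver, $2: CUDA runtime version.
check_cuda_versions() {
    if version_ge "$1" "$2"; then
        echo "driver is new enough for the runtime"
    else
        echo "driver is older than the runtime"
    fi
}

# Example with made-up numbers: check_cuda_versions 7.5 8.0
```

If the second argument is the larger one, the usual remedies are to update the driver or to build against an older toolkit.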

YanShuo1992 commented 7 years ago

I am not sure, but I think CUDA only works with Nvidia GPUs, so it doesn't work on a Mac.

orgicus commented 7 years ago

@YanShuo1992 The Mac I'm using has an NVIDIA GeForce GT 750M (2GB VRAM), and I've installed CUDA 8.0 (Drivers + SDK) and CuDNN 5.1.

YanShuo1992 commented 7 years ago

@orgicus I gave up on using OS X. I tried it on Ubuntu 16.04 in a virtual machine and I can caption the images now. Sorry my answer is not helpful.

orgicus commented 7 years ago

@YanShuo1992 No problem, the suggestion is great. I was thinking of a web instance running Ubuntu, but a virtual machine just for conversions makes more sense.

I've reinstalled/updated torch/cutorch/nn/cunn/cudnn and now I'm getting a different error: THCStorage.cu line=66 error=63 : OS call failed or operation not supported on this OS. Doing a quick search, I see that a potential solution would be to upgrade from OSX 10.11 to 10.12, but with the libraries on the system, I'm paranoid about breaking them :)

The virtual machine option sounds simpler, thank you!

rohanrc1997 commented 5 years ago

How did you configure the virtual machine to use the Nvidia GPU? Or is it possible to install the CUDA drivers + toolkit without having an actual GPU on board, so as to run the cudnn/cunn libraries of torch?
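Whether a guest can use the GPU depends first on whether the device is visible inside the virtual machine at all. As a rough first check on a Linux guest, the PCI device list can be filtered for an NVIDIA entry. The sketch below assumes `lspci` is available in the guest; the helper name `has_nvidia_device` is hypothetical, and a visible device does not by itself mean CUDA will work.

```shell
#!/bin/sh
# Read device-list text (e.g. from `lspci`) on stdin.
# Succeeds if any line mentions NVIDIA, case-insensitively.
has_nvidia_device() {
    grep -qi 'nvidia'
}

# Example inside a Linux guest:
#   lspci | has_nvidia_device && echo "NVIDIA device visible"
```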