DmitryUlyanov / texture_nets

Code for "Texture Networks: Feed-forward Synthesis of Textures and Stylized Images" paper.
Apache License 2.0

invalid device ordinal #74

Open wq409813230 opened 7 years ago

wq409813230 commented 7 years ago

I have installed all the dependencies that this repository needs, but something goes wrong when I run the command below:

th test.lua -input_image /data/artwork/content/huaban.jpeg -model_t7 data/checkpoints/model.t7 -gpu 0

THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-3022/cutorch/init.c line=734 error=10 : invalid device ordinal
/root/AI/torch/install/bin/luajit: test.lua:26: cuda runtime error (10) : invalid device ordinal at /tmp/luarocks_cutorch-scm-1-3022/cutorch/init.c:734
stack traceback:
    [C]: in function 'setDevice'
    test.lua:26: in main chunk
    [C]: in function 'dofile'
    ...t/AI/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x00406670

Below is my GPU info:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.26                 Driver Version: 375.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P5000        Off  | 0000:03:00.0      On |                  Off |
| 26%   39C    P8     8W / 180W |    110MiB / 16264MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1387    G   /usr/lib/xorg/Xorg                             108MiB |
+-----------------------------------------------------------------------------+

I really have no idea where the problem is.

DmitryUlyanov commented 7 years ago

Hi, I think Torch enumerates GPUs from 1. If you have only one GPU, you can omit this argument.


-- Best, Dmitry
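One way to check the numbering point above is to compare what the driver reports with what cutorch sees. The following is a minimal sketch, assuming `nvidia-smi` and `th` are on the PATH and the cutorch rock is installed; `valid_torch_ids` is a hypothetical helper written for this illustration, not part of texture_nets. nvidia-smi numbers GPUs from 0, while `cutorch.setDevice` expects an id between 1 and `cutorch.getDeviceCount()`. How test.lua maps its -gpu flag onto that range is not shown in this thread.

```shell
#!/bin/sh
# Sketch only: valid_torch_ids is a hypothetical helper, not part of texture_nets.

# Print the device ids cutorch would accept for a given device count.
# cutorch numbers devices from 1, so the valid range is 1..count.
valid_torch_ids() {
  count=$1
  if [ "$count" -le 0 ]; then
    echo "none"
    return 0
  fi
  ids=""
  i=1
  while [ "$i" -le "$count" ]; do
    ids="${ids:+$ids }$i"
    i=$((i + 1))
  done
  echo "$ids"
}

# GPUs as the driver numbers them (starting at 0).
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi -L || echo "nvidia-smi failed"
fi

# Device count as cutorch sees it.
if command -v th >/dev/null 2>&1; then
  count=$(th -e "require 'cutorch'; io.write(cutorch.getDeviceCount())") || count=""
  case "$count" in
    ''|*[!0-9]*) echo "unexpected output from th: $count" ;;
    *) echo "cutorch accepts device ids: $(valid_torch_ids "$count")" ;;
  esac
fi
```

If the ids printed by the last step do not include the value that reaches `setDevice`, the call cannot succeed regardless of what nvidia-smi shows.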

wq409813230 commented 7 years ago

Hi Dmitry, thank you for your reply. It still fails when I omit the -gpu argument. What confuses me is that chainer-fast-neuralstyle, which is implemented in Python, also has a -gpu argument, and it runs fine when I set -gpu 0.

engahmed1190 commented 6 years ago

Hi, this issue still persists. Has anyone found a solution for it?

gxlcliqi commented 6 years ago

The GPU index starts from 1; please try the option -gpu 1 instead of -gpu 0.

psenough commented 5 years ago

i also get this error, whatever gpu id i input. cudnn works fine with chainer.

my setup info: ubuntu 16.04 torch7 cuda9.2 cudnn7.1.4

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.26                 Driver Version: 396.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 970     Off  | 00000000:01:00.0  On |                  N/A |
|  0%   46C    P8    17W / 163W |    455MiB /  4040MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0       958    G   /usr/lib/xorg/Xorg                             287MiB |
|    0      1897    G   compiz                                         164MiB |
+-----------------------------------------------------------------------------+

i think it might be because torch7 targets cudnn r5 by default?!

i had to run git clone https://github.com/soumith/cudnn.torch.git -b R7 && cd cudnn.torch && luarocks make cudnn-scm-1.rockspec to get cudnn7 recognized by torch, and had to re-do luarocks install cunn and luarocks install cutorch after that, but now i get this same "invalid device ordinal" error.

maybe there's some sort of version mismatch between cudnn, cunn and cutorch? i don't know where the cunn.torch and cutorch.torch versions compatible with cudnn.torch R7 might be located. does anyone have a clue?

i'm not used to ubuntu and lua :S

psenough commented 5 years ago

found https://github.com/torch/cutorch/issues and yeah, doesn't look like they support cuda 9 yet, that's probably the issue here i think. :/ if anyone has any other insights beyond "try downgrading", i'd appreciate the input.
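For anyone comparing setups in this thread, the versions involved in the suspected mismatch can be collected in one go. This is a rough sketch, assuming the usual tool names (`nvcc`, `nvidia-smi`, `luarocks`) are on the PATH; `report` is a hypothetical helper defined here for illustration, and it only prints what is installed without judging compatibility.

```shell
#!/bin/sh
# Sketch only: report is a hypothetical helper, not part of Torch or texture_nets.

# Run a command if its executable is on PATH; otherwise say it is missing.
report() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "== $*"
    "$@" 2>&1 || echo "($1 exited with an error)"
  else
    echo "$1: not found"
  fi
}

# CUDA toolkit version that rocks such as cutorch are compiled against.
report nvcc --version

# Driver version as reported by the NVIDIA driver.
report nvidia-smi --query-gpu=driver_version --format=csv,noheader

# Installed rocks; look for the cutorch, cunn and cudnn entries.
report luarocks list
```

Posting this output alongside the error makes it easier to see whether the toolkit, the driver and the rocks were built for the same CUDA release.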