jcjohnson / neural-style

Torch implementation of neural style algorithm
MIT License

Segmentation Fault (Jetson Tegra TX2) #418

Open vitcozzolino opened 6 years ago

vitcozzolino commented 6 years ago

I'm on a Jetson Tegra TX2 (64-bit Ubuntu, 6 cores, 8 GB RAM, GPU) and I'm unable to run any example; every run ends with a segmentation fault. For example:

th neural_style.lua -print_iter 1 -gpu 0
Successfully loaded models/VGG_ILSVRC_19_layers.caffemodel
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
Segmentation fault (core dumped)
th neural_style.lua -print_iter 1 -gpu -1
Successfully loaded models/VGG_ILSVRC_19_layers.caffemodel
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
Segmentation fault (core dumped)

htoyryla commented 6 years ago

There was a similar problem on the Jetson TK1 as well as on various 32-bit Linux installations. In those cases, the solution was to compile Torch to use Lua 5.2 instead of LuaJIT. You say you have 64-bit Ubuntu, so this might not apply to you, but on the other hand it might. See this thread, especially the last comment about using Lua 5.2: https://github.com/jcjohnson/neural-style/issues/193
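
For reference, a minimal sketch of such a rebuild. It assumes Torch was installed from the torch/distro checkout in ~/torch (adjust TORCH_DIR if yours lives elsewhere) and uses that installer's TORCH_LUA_VERSION switch. The wrapper function and the DRY_RUN variable are only illustrative, not part of Torch; by default the script just prints the commands it would run.

```shell
#!/bin/sh
# Assumption: Torch was cloned from torch/distro into ~/torch.
TORCH_DIR="${TORCH_DIR:-$HOME/torch}"
# LUA52 selects plain Lua 5.2 instead of the default LuaJIT.
LUA_VERSION="${LUA_VERSION:-LUA52}"
# Hypothetical safety switch: 1 = only print the commands, 0 = run them.
DRY_RUN="${DRY_RUN:-1}"

rebuild_torch() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "cd $TORCH_DIR"
        echo "./clean.sh"
        echo "TORCH_LUA_VERSION=$LUA_VERSION ./install.sh"
    else
        cd "$TORCH_DIR" || return 1
        # Remove the previous build so nothing compiled against LuaJIT is left over.
        ./clean.sh || return 1
        TORCH_LUA_VERSION="$LUA_VERSION" ./install.sh
    fi
}

rebuild_torch
```

Run it with DRY_RUN=0 to perform the rebuild. Going back to the default LuaJIT build should be the same sequence without TORCH_LUA_VERSION set.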

vitcozzolino commented 6 years ago

I've tried what you suggested but now things are even worse.

th neural_style.lua -print_iter 1 -gpu -1
Bus error (core dumped)
th neural_style.lua -print_iter 1 -gpu 0
Bus error (core dumped)

htoyryla commented 6 years ago

No idea why that happens. Somebody else has reported a bus error using Torch on a TX1, but got no response: https://github.com/torch/torch7/issues/950

Reinstalling Torch with the defaults should get you back to where you started. I don't know what the shortest way to get Torch working is, though. Anyway, this is almost certainly a Torch issue (not a neural-style issue), and you might get better help in the Torch users group: https://groups.google.com/forum/#!forum/torch7

vitcozzolino commented 6 years ago

Thanks a lot for the help! I'll have a look at the Torch users group.

htoyryla commented 6 years ago

BTW, after installing Torch with Lua 5.2, one needs to reinstall all the rocks installed through luarocks (such as nn, cutorch, cunn, cudnn), too. I don't know if you were aware of this (it applies every time one reinstalls Torch). I wouldn't expect such a cryptic error because of this, but you never know for sure.
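
A minimal sketch of that reinstall step, assuming the luarocks on your PATH is the one from the fresh Torch install. The rock list is just the four named above and may need extending for your setup. The function and the DRY_RUN variable are illustrative only; by default the script prints the commands rather than running them.

```shell
#!/bin/sh
# Rocks named in the comment above; extend the list as needed.
ROCKS="nn cutorch cunn cudnn"
# Hypothetical safety switch: 1 = only print the commands, 0 = run them.
DRY_RUN="${DRY_RUN:-1}"

reinstall_rocks() {
    for rock in $ROCKS; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "luarocks install $rock"
        else
            # Stop at the first failure so a broken rock is easy to spot.
            luarocks install "$rock" || return 1
        fi
    done
}

reinstall_rocks
```

Run it with DRY_RUN=0 once the Torch rebuild has finished.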