torch / torch7

no helpful error messages to me #1097

Open MartinHahner opened 6 years ago

MartinHahner commented 6 years ago

Hi,

I want to use torch7 within DIGITS on my Linux machine, but it is not working at all.

I can start the "trepl" with th without a problem, but when I want to use torch7 in DIGITS, I get the following output log:

2017-10-27 10:39:30 [INFO ] Loading mean tensor from /scratch/fs1/hahnerm/experiments/20171027-103925-8611/mean.jpg file
2017-10-27 10:39:30 [INFO ] Loading label definitions from /scratch/fs1/hahnerm/experiments/20170907-172923-c873/labels.txt file
2017-10-27 10:39:30 [INFO ] found 10 categories
2017-10-27 10:39:30 [INFO ] creating data readers
2017-10-27 10:39:30 [FAIL] ...ameworks/torch/install/share/lua/5.1/threads/threads.lua:183: [thread 2 callback] /home/hahnerm/frameworks/torch/install/share/lua/5.1/pb.lua:213:
no file '/home/hahnerm/frameworks/digits/digits/tools/torch/lightningmdb.proto'
no file '/home/hahnerm/frameworks/digits/digits/tools/torch/lightningmdb.proto'
no file '/home/hahnerm/frameworks/digits/digits/tools/torch/lightningmdb.proto'
no file '/home/hahnerm/frameworks/digits/digits/tools/torch/lightningmdb.proto'
no file '/home/hahnerm/frameworks/digits/digits/tools/torch/lightningmdb.proto'
no file '/home/hahnerm/frameworks/digits/digits/tools/torch/lightningmdb.proto'
no file '/home/hahnerm/.luarocks/share/lua/5.1/lightningmdb.proto'
no file '/home/hahnerm/frameworks/torch/install/share/lua/5.1/lightningmdb.proto'
no file './lightningmdb.proto'
no file '/home/hahnerm/frameworks/torch/install/share/luajit-2.1.0-beta1/lightningmdb.proto'
no file '/usr/local/share/lua/5.1/lightningmdb.proto'
stack traceback:
[C]: in function 'assert'
/home/hahnerm/frameworks/torch/install/share/lua/5.1/pb.lua:213: in function 'require'
/home/hahnerm/frameworks/torch/install/share/lua/5.1/pb.lua:254: in function 'searcher'
...e/hahnerm/frameworks/digits/digits/tools/torch/utils.lua:203: in function 'isModuleAvailable'
...e/hahnerm/frameworks/digits/digits/tools/torch/utils.lua:215: in function 'check_require'
/home/hahnerm/frameworks/digits/digits/tools/torch/data.lua:378: in function 'new'
/home/hahnerm/frameworks/digits/digits/tools/torch/data.lua:728: in function </home/hahnerm/frameworks/digits/digits/tools/torch/data.lua:723>
[C]: in function 'xpcall'
...ameworks/torch/install/share/lua/5.1/threads/threads.lua:234: in function 'callback'
...frameworks/torch/install/share/lua/5.1/threads/queue.lua:65: in function <...frameworks/torch/install/share/lua/5.1/threads/queue.lua:41>
[C]: in function 'pcall'
...frameworks/torch/install/share/lua/5.1/threads/queue.lua:40: in function 'dojob'
[string " local Queue = require 'threads.queue'..."]:13: in main chunk
DIGITS Lua Error
stack traceback:
[C]: in function 'error'
...ameworks/torch/install/share/lua/5.1/threads/threads.lua:183: in function 'dojob'
...ameworks/torch/install/share/lua/5.1/threads/threads.lua:264: in function 'synchronize'
...ameworks/torch/install/share/lua/5.1/threads/threads.lua:142: in function 'specific'
...ameworks/torch/install/share/lua/5.1/threads/threads.lua:125: in function 'Threads'
/home/hahnerm/frameworks/digits/digits/tools/torch/data.lua:721: in function 'new'
/home/hahnerm/frameworks/digits/digits/tools/torch/main.lua:229: in main chunk
[C]: in function 'dofile'
...hahnerm/frameworks/digits/digits/tools/torch/wrapper.lua:25: in function <...hahnerm/frameworks/digits/digits/tools/torch/wrapper.lua:25>
[C]: in function 'xpcall'
...hahnerm/frameworks/digits/digits/tools/torch/wrapper.lua:25: in main chunk
[C]: in function 'dofile'
...orks/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406670

So I tried this example to use torch7 directly from the terminal. I installed all of its dependencies, including eladtools, which I installed with:

luarocks install https://raw.githubusercontent.com/eladhoffer/eladtools/master/eladtools-scm-1.rockspec

Using https://raw.githubusercontent.com/eladhoffer/eladtools/master/eladtools-scm-1.rockspec... switching to 'build' mode
Cloning into 'eladtools'...
remote: Counting objects: 20, done.
remote: Compressing objects: 100% (20/20), done.
remote: Total 20 (delta 0), reused 7 (delta 0), pack-reused 0
Unpacking objects: 100% (20/20), done.
cmake -E make_directory build; cd build; cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH="/home/hahnerm/frameworks/torch/install/bin/.." -DCMAKE_INSTALL_PREFIX="/home/hahnerm/frameworks/torch/install/lib/luarocks/rocks/eladtools/scm-1"; make
-- The C compiler identification is GNU 4.8.4
-- The CXX compiler identification is GNU 4.8.4
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Found Torch7 in /home/hahnerm/frameworks/torch/install
-- Configuring done
-- Generating done
-- Build files have been written to: /tmp/luarocks_eladtools-scm-1-888/eladtools/build
cd build && make install
Install the project...
-- Install configuration: "Release"
-- Installing: /home/hahnerm/frameworks/lmdb/home/hahnerm/frameworks/torch/install/lib/luarocks/rocks/eladtools/scm-1/lua/eladtools/SSU.lua
-- Installing: /home/hahnerm/frameworks/lmdb/home/hahnerm/frameworks/torch/install/lib/luarocks/rocks/eladtools/scm-1/lua/eladtools/RecurrentLayer.lua
-- Installing: /home/hahnerm/frameworks/lmdb/home/hahnerm/frameworks/torch/install/lib/luarocks/rocks/eladtools/scm-1/lua/eladtools/GlobalDominantPooling.lua
-- Installing: /home/hahnerm/frameworks/lmdb/home/hahnerm/frameworks/torch/install/lib/luarocks/rocks/eladtools/scm-1/lua/eladtools/utils.lua
-- Installing: /home/hahnerm/frameworks/lmdb/home/hahnerm/frameworks/torch/install/lib/luarocks/rocks/eladtools/scm-1/lua/eladtools/SpatialNMS.lua
-- Installing: /home/hahnerm/frameworks/lmdb/home/hahnerm/frameworks/torch/install/lib/luarocks/rocks/eladtools/scm-1/lua/eladtools/Optimizer.lua
-- Installing: /home/hahnerm/frameworks/lmdb/home/hahnerm/frameworks/torch/install/lib/luarocks/rocks/eladtools/scm-1/lua/eladtools/NetConversion.lua
-- Installing: /home/hahnerm/frameworks/lmdb/home/hahnerm/frameworks/torch/install/lib/luarocks/rocks/eladtools/scm-1/lua/eladtools/SpatialBottleNeck.lua
-- Installing: /home/hahnerm/frameworks/lmdb/home/hahnerm/frameworks/torch/install/lib/luarocks/rocks/eladtools/scm-1/lua/eladtools/SpatialConvolutionDCT.lua
-- Installing: /home/hahnerm/frameworks/lmdb/home/hahnerm/frameworks/torch/install/lib/luarocks/rocks/eladtools/scm-1/lua/eladtools/init.lua
-- Installing: /home/hahnerm/frameworks/lmdb/home/hahnerm/frameworks/torch/install/lib/luarocks/rocks/eladtools/scm-1/lua/eladtools/testSwallowBN.lua
-- Installing: /home/hahnerm/frameworks/lmdb/home/hahnerm/frameworks/torch/install/lib/luarocks/rocks/eladtools/scm-1/lua/eladtools/SelectPoint.lua
-- Installing: /home/hahnerm/frameworks/lmdb/home/hahnerm/frameworks/torch/install/lib/luarocks/rocks/eladtools/scm-1/lua/eladtools/EarlyStop.lua
-- Installing: /home/hahnerm/frameworks/lmdb/home/hahnerm/frameworks/torch/install/lib/luarocks/rocks/eladtools/scm-1/lua/eladtools/ODCT.lua
Updating manifest for /home/hahnerm/frameworks/torch/install/lib/luarocks/rocks
eladtools scm-1 is now built and installed in /home/hahnerm/frameworks/torch/install/

But when I then invoke the training with th Main.lua -network AlexNet -LR 0.01, I get this error message:

/home/hahnerm/frameworks/torch/install/bin/luajit: ...rm/frameworks/torch/install/share/lua/5.1/trepl/init.lua:389: module 'eladtools' not found:No LuaRocks module found for eladtools
no field package.preload['eladtools']
no file '/home/hahnerm/.luarocks/share/lua/5.1/eladtools.lua'
no file '/home/hahnerm/.luarocks/share/lua/5.1/eladtools/init.lua'
no file '/home/hahnerm/frameworks/torch/install/share/lua/5.1/eladtools.lua'
no file '/home/hahnerm/frameworks/torch/install/share/lua/5.1/eladtools/init.lua'
no file './eladtools.lua'
no file '/home/hahnerm/frameworks/torch/install/share/luajit-2.1.0-beta1/eladtools.lua'
no file '/usr/local/share/lua/5.1/eladtools.lua'
no file '/usr/local/share/lua/5.1/eladtools/init.lua'
no file '/home/hahnerm/.luarocks/lib/lua/5.1/eladtools.so'
no file '/home/hahnerm/frameworks/torch/install/lib/lua/5.1/eladtools.so'
no file '/home/hahnerm/frameworks/torch/install/lib/eladtools.so'
no file './eladtools.so'
no file '/usr/local/lib/lua/5.1/eladtools.so'
no file '/usr/local/lib/lua/5.1/loadall.so'
stack traceback:
[C]: in function 'error'
...rm/frameworks/torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
Main.lua:4: in main chunk
[C]: in function 'dofile'
...orks/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406670

Any help would be much appreciated!

clement-masson commented 6 years ago

For DIGITS I don't know, but for the last error, it looks like your package has not been installed in the proper location. If you look at your installation output, there are paths of the form '/home/hahnerm/frameworks/lmdb/home/hahnerm/frameworks/torch/install/lib/luarocks/rocks/eladtools/scm-1/lua/eladtools/XXX' all over the place.

/home/hahnerm/frameworks/lmdb/home/hahnerm/frameworks/torch/ ... looks suspicious for a path. It's as if an absolute path had been appended to another absolute path.
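One thing that could produce exactly this shape of path is a DESTDIR variable left in the environment, since `make install` prepends DESTDIR to every install path. That is only a guess: nothing in the log confirms it, and the value used below is an assumption read off the "Installing:" lines. A minimal sketch of how the doubled path would come about, plus a check of the current shell:

```shell
#!/bin/sh
# Sketch only: DESTDIR being the cause is an assumption, not confirmed by the log.

# Install prefix as passed to cmake (-DCMAKE_INSTALL_PREFIX) in the build log.
prefix="/home/hahnerm/frameworks/torch/install/lib/luarocks/rocks/eladtools/scm-1"

# Hypothetical value: the extra leading directory seen in the "Installing:" lines.
assumed_destdir="/home/hahnerm/frameworks/lmdb"

# "make install" prepends DESTDIR to each install path, giving the doubled path.
staged_path="${assumed_destdir}${prefix}/lua/eladtools/init.lua"
echo "$staged_path"

# Report whether DESTDIR is set in the shell that runs luarocks.
echo "DESTDIR is: ${DESTDIR:-<unset>}"
```

If the last line prints a directory rather than `<unset>`, running `unset DESTDIR` and then repeating the `luarocks install` command would be worth a try. If it is unset, the extra prefix is coming from somewhere else and this guess does not apply.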