jcjohnson / torch-rnn

Efficient, reusable RNNs and LSTMs for torch
MIT License

error on `train.lua`: ';' expected near ')' at line 579 #58

Open nylki opened 8 years ago

nylki commented 8 years ago

I installed all dependencies and preprocessed a text file (I also tried the provided shakespeare.txt). However, train.lua throws an error. What could be the cause of this?

➜  torch-rnn git:(master) ✗ th train.lua -input_h5 my_data.h5 -input_json my_data.json
/home/tom/gits/torch/install/bin/luajit: /home/tom/gits/torch/install/share/lua/5.1/trepl/init.lua:363: /home/tom/gits/torch/install/share/lua/5.1/trepl/init.lua:363: /home/tom/gits/torch/install/share/lua/5.1/hdf5/ffi.lua:56: ';' expected near ')' at line 579
stack traceback:
    [C]: in function 'error'
    /home/tom/gits/torch/install/share/lua/5.1/trepl/init.lua:363: in function 'require'
    train.lua:6: in main chunk
    [C]: in function 'dofile'
    ...gits/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
    [C]: at 0x00405d60

jcjohnson commented 8 years ago

Looks like a problem with the HDF5 library. Make sure you've installed the correct one: https://github.com/deepmind/torch-hdf5

gphuang commented 8 years ago

Hi, Justin,

Same problem here, and it persists.

I installed torch-hdf5 locally on Ubuntu 14.04, following https://github.com/deepmind/torch-hdf5/blob/master/doc/usage.md. The install reported:

hdf5 0-0 is now built and installed in /people/huang/tools/torch/install/ (license: BSD)

I tried to run on a cluster with gpu, and train.lua fails because of hdf5.

Environment settings:

set torch_rnn=/people/huang/tools/torch-rnn
setenv PATH /people/huang/tools/torch/install/bin:${PATH}
setenv LD_LIBRARY_PATH /people/huang/tools/torch/install/lib:${LD_LIBRARY_PATH}
setenv PATH /usr/local/cuda/bin:${PATH}
setenv LD_LIBRARY_PATH /usr/local/cuda/lib64:${LD_LIBRARY_PATH}
setenv PYTHONPATH /people/huang/local/canopy/User/lib/python2.7/site-packages/

Error message when running train.lua:

th $torch_rnn/train.lua -input_h5 $tmp/lm_lstm_torch/data/my_data.h5 -input_json $tmp/lm_lstm_torch/data/my_data.json -model_type lstm -num_layers 3 -rnn_size 512 -gpu_backend opencl > $tmp/lm_lstm_torch/lm_lstm_torch.log
/people/huang/tools/torch/install/bin/luajit: ...e/huang/tools/torch/install/share/lua/5.1/trepl/init.lua:363: ...e/huang/tools/torch/install/share/lua/5.1/trepl/init.lua:363: ...ple/huang/tools/torch/install/share/lua/5.1/hdf5/ffi.lua:29: libhdf5.so: cannot open shared object file: No such file or directory
stack traceback:
    [C]: in function 'error'
    ...e/huang/tools/torch/install/share/lua/5.1/trepl/init.lua:363: in function 'require'
    /people/huang/tools/torch-rnn/train.lua:6: in main chunk
    [C]: in function 'dofile'
    ...ools/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
    [C]: at 0x00406670

$tmp/lm_lstm_torch/lm_lstm_torch.log
/people/huang/tools/torch/install/share/lua/5.1/hdf5/init.lua:15 Unable to find the HDF5 lib we were built against - trying to find it elsewhere

nylki commented 8 years ago

@jcjohnson I installed the hdf5 lib as specified in your readme:

git clone https://github.com/deepmind/torch-hdf5
cd torch-hdf5
luarocks make hdf5-0-0.rockspec

I am on Fedora 23 btw.

jcjohnson commented 8 years ago

Does the following command work?

th -e "require 'hdf5'"

gphuang commented 8 years ago

Thanks for the quick reply.

The command gives the same error.

th -e "require 'hdf5'"
...le/huang/tools/torch/install/share/lua/5.1/hdf5/init.lua:15 Unable to find the HDF5 lib we were built against - trying to find it elsewhere
...e/huang/tools/torch/install/share/lua/5.1/trepl/init.lua:363: ...ple/huang/tools/torch/install/share/lua/5.1/hdf5/ffi.lua:29: libhdf5.so: cannot open shared object file: No such file or directory

However, the hdf5 files are in the directory shown in the error message:

ls -l /people//huang/tools/torch/install/share/lua/5.1/hdf5/
total 60
-rw-r--r-- 1 huang grptlp   262 Apr 11 15:30 config.lua
-rw-r--r-- 1 huang grptlp  6102 Apr 11 15:30 dataset.lua
-rw-r--r-- 1 huang grptlp  3709 Apr 11 15:30 datasetOptions.lua
-rw-r--r-- 1 huang grptlp 13057 Apr 11 15:30 ffi.lua
-rw-r--r-- 1 huang grptlp  5209 Apr 11 15:30 file.lua
-rw-r--r-- 1 huang grptlp 10622 Apr 11 15:30 group.lua
-rw-r--r-- 1 huang grptlp  3037 Apr 11 15:30 init.lua
-rw-r--r-- 1 huang grptlp  1372 Apr 11 15:30 testUtils.lua
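Note that this directory holds only the Lua wrapper files, whereas the error is about the C shared library libhdf5.so, which the dynamic loader looks up separately. As a rough sketch (the helper name find_shared_lib is hypothetical, and it checks only the directories it is handed, not the loader cache or system defaults), one could see which entries of a colon-separated list such as LD_LIBRARY_PATH actually contain the library:

```shell
#!/bin/sh
# find_shared_lib NAME PATHLIST
# Prints every directory in the colon-separated PATHLIST that contains a
# file called NAME. Returns non-zero if none does.
# (Hypothetical helper; it does not consult ldconfig or default system dirs.)
find_shared_lib() {
    name=$1
    found=1
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        if [ -n "$dir" ] && [ -e "$dir/$name" ]; then
            printf '%s\n' "$dir"
            found=0
        fi
    done
    IFS=$old_ifs
    return $found
}

# Example:
#   find_shared_lib libhdf5.so "$LD_LIBRARY_PATH" || echo "libhdf5.so not found"
```

If nothing is printed, the library is either not installed or lives in a directory that is not on the given search path.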

nylki commented 8 years ago

@jcjohnson Same error here as well:

➜  torch-rnn git:(master) th -e "require 'hdf5'"
/home/tom/torch/install/share/lua/5.1/trepl/init.lua:363: /home/tom/torch/install/share/lua/5.1/hdf5/ffi.lua:56: ';' expected near ')' at line 579

jcjohnson commented 8 years ago

To make HDF5 work you also need to install the C library; the lua package is just a wrapper. For example on Ubuntu you need to run

sudo apt-get install libhdf5-dev

Did you do that?

nylki commented 8 years ago

@jcjohnson Yep. On Fedora I did dnf install hdf5-devel. It's version 1.8.15.

jcjohnson commented 8 years ago

@gp-huang Sorry, I'm not sure how to do that - I've never tried to install locally.

@nylki Can you take a look at the file that is throwing a syntax error, and maybe paste it as a gist or pastebin? Does it look like this?

https://github.com/deepmind/torch-hdf5/blob/f364b442655b0fe21dafe83104f42c3bb7b2a594/luasrc/ffi.lua

The fact that you are getting a syntax error is very strange - it makes me think that somehow your torch-hdf5 install got corrupted.

nylki commented 8 years ago

@jcjohnson I just did a diff on the file I have on my system and the one you linked. They are identical. (for reference here is mine: https://gist.github.com/nylki/d823e303a8faa0b185895998f38a1524 )

Thanks for not giving up on me so far! :) What else could possibly result in the syntax error?

jcjohnson commented 8 years ago

@nylki We are in the territory of debugging torch-hdf5 now, which I don't know much about; you might have better luck opening an issue over there instead.

But what is basically going on in this file is that Lua code is programmatically generating C code, and then using the luajit foreign function interface API to compile the C code and expose it as a set of Lua functions. The syntax error is happening when luajit tries to compile the generated C code.

I'd try adding a print statement here

https://gist.github.com/nylki/d823e303a8faa0b185895998f38a1524#file-ffi-lua-L56

to see whether cdef looks like valid C code, or whether it has some C syntax error.

Another possibility is that whatever C compiler is getting invoked on your system by luajit is somehow different or more strict than the one torch-hdf5 was expecting; I'm not sure how to debug that.
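As an alternative to editing ffi.lua, the suspect declaration can be inspected from the shell. The sketch below assumes that the "at line 579" in the error counts lines of the C declarations passed to luajit (not lines of ffi.lua, which is reported separately as line 56), and that those declarations are the preprocessor output for hdf5.h; if ffi.lua filters that output first, the numbering may be offset. The helper name show_lines_around is made up for illustration.

```shell
#!/bin/sh
# show_lines_around LINE [RADIUS]
# Reads text on stdin and prints lines LINE-RADIUS..LINE+RADIUS, each
# prefixed with its line number (RADIUS defaults to 2).
show_lines_around() {
    awk -v target="$1" -v radius="${2:-2}" '
        NR >= target - radius && NR <= target + radius {
            printf "%d: %s\n", NR, $0
        }'
}

# Example, assuming the header lives in /usr/include:
#   gcc -E /usr/include/hdf5.h | show_lines_around 579
```

Anything in the printed window that is not plain C (compiler-specific keywords or attributes, for instance) is a candidate for what the parser is choking on.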

MariumS commented 8 years ago

I also commented on the blog post,

"All the steps have worked fine until running the training program. At that step I get this error:

Mariums-MacBook-Pro:torch-rnn mariumsultan$ th train.lua -input_h5 data/Dracula.h5 -input_json data/Dracula.json -gpu -1
/Users/mariumsultan/torch/install/bin/luajit: …/mariumsultan/torch/install/share/lua/5.1/trepl/init.lua:384: …/mariumsultan/torch/install/share/lua/5.1/trepl/init.lua:384: …rs/mariumsultan/torch/install/share/lua/5.1/hdf5/ffi.lua:42: Error: unable to locate HDF5 header file at hdf5.h
stack traceback:
    [C]: in function 'error'
    …/mariumsultan/torch/install/share/lua/5.1/trepl/init.lua:384: in function 'require'
    train.lua:6: in main chunk
    [C]: in function 'dofile'
    …ltan/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x01031bfcf0
Mariums-MacBook-Pro:torch-rnn mariumsultan$

Any suggestions?"

I'd like to add that my Hdf5 is built and installed.

[screenshot: 2016-08-01, 3:16:16 PM]

and that the suggested libhdf5-dev didn't install through brew

marcociccone commented 8 years ago

I don't know if there is a bug in the build process of torch-hdf5, but in my case I got this error because it couldn't find the include path for hdf5.h, so I added it by hand in the config file /home/marcus/torch/install/share/lua/5.1/hdf5/config.lua.

Just locate your hdf5.h (presumably /usr/include) and then set the correct variable:

HDF5_INCLUDE_PATH = "/usr/include"

I hope it solves your issue as well.
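To find which directory to put in HDF5_INCLUDE_PATH, a small helper along these lines may be handy. It is only a sketch: the function name find_hdf5_include is hypothetical, and the candidate directories are whatever you choose to pass in.

```shell
#!/bin/sh
# find_hdf5_include DIR...
# Prints the first DIR that directly contains hdf5.h; returns non-zero if
# none does. (Hypothetical helper name; pass whatever candidates apply.)
find_hdf5_include() {
    for dir in "$@"; do
        if [ -f "$dir/hdf5.h" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Example with candidate directories mentioned in this thread:
#   find_hdf5_include /usr/include /usr/local/include
```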

SeaCelo commented 8 years ago

@MarcoCiccone That worked for me after two days of searching for the answer. Thanks.

Just a note: I added two paths since I don't think that /usr/include has the right file. My config.lua now looks like this:

hdf5._config = {
  HDF5_INCLUDE_PATH = "",
  HDF5_INCLUDE_PATH = "/usr/include",
  HDF5_INCLUDE_PATH = "/usr/local/include",
  HDF5_LIBRARIES = "/usr/lib/libpthread.dylib;/usr/local/lib/libhdf5_cpp.dylib;/usr/local/lib/libhdf5.dylib;/usr/local/lib/libsz.dylib;/usr/lib/libz.dylib;/usr/lib/libdl.dylib;/usr/lib/libm.dylib"
}
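A caveat on the config above: a Lua table holds one value per key, so of the three HDF5_INCLUDE_PATH entries only one can take effect (in practice usually the last, here /usr/local/include, though that ordering is best treated as an assumption rather than a guarantee). A quick sketch for spotting repeated keys, using a hypothetical helper name and assuming the keys are upper-case as in this file:

```shell
#!/bin/sh
# duplicate_config_keys FILE
# Prints each upper-case KEY that is assigned more than once in FILE.
# Relies on "grep -o", which GNU and BSD grep both provide.
duplicate_config_keys() {
    grep -oE '[A-Z][A-Z0-9_]*[[:space:]]*=' "$1" |
        sed 's/[[:space:]]*=$//' |
        sort | uniq -d
}

# Example (substitute your own torch path):
#   duplicate_config_keys ~/torch/install/share/lua/5.1/hdf5/config.lua
```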

I tested that it worked by running th -e "require 'hdf5'"

yoosan commented 8 years ago

Hi, I got the solution for this issue (Mac OS 10.11). If you install hdf5 with brew install hdf5, it is installed under /usr/local/Cellar/hdf5. Once you have installed torch-hdf5 from the deepmind repo, edit config.lua in /Users/yoosan/torch/install/share/lua/5.1/hdf5 (substitute your own torch path) and set HDF5_INCLUDE_PATH="/usr/local/Cellar/hdf5/1.8.16_1/include" (note the version).

Saintis commented 8 years ago

To expand on @yoosan's answer above, I found that setting the path to /usr/local/include worked as well.

0bserver07 commented 8 years ago

/Users/yad/torch/install/share/lua/5.1/trepl/init.lua:384: /Users/yad/.luarocks/share/lua/5.1/hdf5/ffi.lua:42: Error: unable to locate HDF5 header file at hdf5.h

I edited line 42 to include the path and it works now.

xanderdunn commented 8 years ago

@nylki: This is the solution to your original problem.

sfsekaran commented 7 years ago

To make it easier for people to copy/paste and understand, here's what my torch/install/share/lua/5.1/hdf5/config.lua looked like before:

hdf5._config = {
  HDF5_INCLUDE_PATH = "",
  HDF5_LIBRARIES = "/usr/local/lib/libhdf5.dylib;/usr/local/lib/libsz.dylib;/usr/lib/libz.dylib;/usr/lib/libdl.dylib;/usr/lib/libm.dylib"
}

and after:

hdf5._config = {
  HDF5_INCLUDE_PATH = "/usr/local/include",
  HDF5_LIBRARIES = "/usr/local/lib/libhdf5.dylib;/usr/local/lib/libsz.dylib;/usr/lib/libz.dylib;/usr/lib/libdl.dylib;/usr/lib/libm.dylib"
}

:smile:
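The before/after change above could be scripted roughly as follows. This is a sketch under stated assumptions: the helper name set_hdf5_include_path is hypothetical, config.lua is assumed to have the one-key-per-line layout shown above, and the directory must not contain '|', '&' or backslash characters because it is spliced into a sed expression.

```shell
#!/bin/sh
# set_hdf5_include_path CONFIG DIR
# Rewrites the HDF5_INCLUDE_PATH line of CONFIG to point at DIR, keeping a
# copy of the original as CONFIG.bak. Assumes the one-key-per-line layout
# and that DIR contains no '|', '&' or '\' characters.
set_hdf5_include_path() {
    cp "$1" "$1.bak" &&
    sed "s|^\([[:space:]]*HDF5_INCLUDE_PATH[[:space:]]*=[[:space:]]*\)\"[^\"]*\"|\1\"$2\"|" \
        "$1.bak" > "$1"
}

# Example (substitute your own torch path):
#   set_hdf5_include_path ~/torch/install/share/lua/5.1/hdf5/config.lua /usr/local/include
```

Reading from the backup copy and writing to the original sidesteps the differences between GNU and BSD sed's in-place flags.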

grishmarao commented 7 years ago

Hi, I'm having trouble running th train.lua. I think it's an HDF5 problem, because I tried running:

th -e "require 'hdf5'"

and I'm getting this error:

/Users/Apple/torch/install/share/lua/5.1/trepl/init.lua:389: /Users/Apple/torch/install/share/lua/5.1/hdf5/config.lua:2: unexpected symbol near 'local'

I've edited my config.lua file to match the answers above, which fixed the initial HDF5 errors. Any ideas on how to fix this? Thanks

chidg commented 7 years ago

@grishmarao, it sounds like you just have a syntax error in the config file. Would you like to paste what you've got?
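One possible cause, offered purely as a guess since the file was never posted in this thread: text copied from a web page sometimes carries curly quotes instead of plain ASCII double quotes, and Lua will not treat those as string delimiters. A sketch for flagging such lines (the helper name non_ascii_lines is made up):

```shell
#!/bin/sh
# non_ascii_lines FILE
# Prints, with line numbers, every line of FILE containing a byte outside
# printable ASCII and whitespace (for example a curly quote).
non_ascii_lines() {
    LC_ALL=C grep -n '[^[:print:][:space:]]' "$1"
}

# Example (substitute your own torch path):
#   non_ascii_lines ~/torch/install/share/lua/5.1/hdf5/config.lua
```

No output means the file is plain ASCII, in which case the syntax error lies elsewhere (a missing comma or quote, say).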

McLawrence commented 7 years ago

Changing the config.lua file doesn't solve the problem for me. My error never included the "unable to locate HDF5 header file at hdf5.h" message. I just get:

th -e "require 'hdf5'"
..<UserName>/torch/install/share/lua/5.1/trepl/init.lua:389: ...s/<UserName>/torch/install/share/lua/5.1/hdf5/ffi.lua:56: ')' expected near '_close' at line 1436

dj2mn commented 7 years ago

I was having the same 'hdf5 header not found' error, which I resolved by making config.lua contain one and only one path, i.e.

HDF5_INCLUDE_PATH = "/usr/local/Cellar/hdf5/1.10.1_2/include"

/usr/local/include didn't work on its own; I suspect it was having trouble following the symlinks. I also suspect that ; wasn't working as a delimiter when I had both paths in place, as there was something fishy in the text of the error it threw in that case.

However that's all resolved now and I've got the same issue as @McLawrence and @grishmarao .. well the same error but on different lines. I must have a different version of something..

/Users/.../torch/install/bin/luajit: /Users/.../torch/install/share/lua/5.1/trepl/init.lua:389: /Users/.../torch/install/share/lua/5.1/trepl/init.lua:389: /Users/.../torch/install/share/lua/5.1/hdf5/ffi.lua:56: ')' expected near '_close' at line 3401

OSX 10.12.6 home-brew 1.3.1, FWIW.

mendadala commented 7 years ago

I am also getting the same error:

/Users/mars/torch/install/bin/luajit: /Users/mars/torch/install/share/lua/5.1/trepl/init.lua:389: /Users/mars/torch/install/share/lua/5.1/trepl/init.lua:389: /Users/mars/torch/install/share/lua/5.1/hdf5/ffi.lua:56: ')' expected near '_close' at line 1437
stack traceback:
    [C]: in function 'error'
    /Users/mars/torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
    train.lua:6: in main chunk
    [C]: in function 'dofile'
    ...mars/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x010e18ea10

cameronblandford commented 7 years ago

Getting the same error as well but different line numbers, same system specs / versions as @mendadala.

/Users/cam/torch/install/bin/luajit: /Users/cam/torch/install/share/lua/5.1/trepl/init.lua:389: /Users/cam/torch/install/share/lua/5.1/trepl/init.lua:389: /Users/cam/torch/install/share/lua/5.1/hdf5/ffi.lua:56: ')' expected near '_close' at line 1472
stack traceback:
    [C]: in function 'error'
    /Users/cam/torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
    train.lua:6: in main chunk
    [C]: in function 'dofile'
    .../cam/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x010969ba10

I'm guessing a dependency might be out of date somewhere? Seems more likely than a typo that made it into production.

dj2mn commented 7 years ago

I've solved my issues and got it all working! I've lost the place where I found the solution to the ')' expected near '_close' error, but the solution was to edit line 44 of install/share/lua/5.1/hdf5/ffi.lua to read

local process = io.popen("gcc -D '_Nullable=' -E " .. headerPath) -- TODO pass -I

then

brew install hdf5@1.8
mv /usr/local/Cellar/hdf5@1.8/1.8.19 /usr/local/Cellar/hdf5/

then adjust install/share/lua/5.1/hdf5/config.lua so it now reads

hdf5._config = {
  HDF5_INCLUDE_PATH = "/usr/local/Cellar/hdf5/1.8.19/include",
  HDF5_LIBRARIES = "/usr/local/Cellar/hdf5/1.8.19/lib/libhdf5.dylib;/usr/local/opt/szip/lib/libsz.dylib;/usr/lib/libz.dylib;/usr/lib/libdl.dylib;/usr/lib/libm.dylib"
}
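For context on the ffi.lua edit: -D '_Nullable=' defines _Nullable as an empty macro, so the preprocessor drops that Clang-specific qualifier before luajit parses the declarations. The sketch below shows the effect on a single made-up declaration rather than on the real headers; the helper name strip_nullable is hypothetical, and the guess that the failing declaration looks like this one is based only on the '_close' in the error text.

```shell
#!/bin/sh
# strip_nullable
# Preprocesses C source from stdin with _Nullable defined as empty.
# -P suppresses linemarkers; "-x c -" tells gcc to read C from stdin.
strip_nullable() {
    gcc -D '_Nullable=' -E -P -x c -
}

# Example:
#   printf 'int (* _Nullable _close)(void *);\n' | strip_nullable
```

With the qualifier gone, the declaration is plain C that a parser with no knowledge of nullability annotations can read.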

mendadala commented 7 years ago

Hello all, thanks a ton for all your help. I finally resolved all my errors. Now, after everything I went through, it is giving me an "out of memory" error!

MARSs-MBP:torch-rnn neptune$ th train.lua -input_h5 data/tiny_shakespeare.h5 -input_json data/tiny_shakespeare.json
Running with CUDA on GPU 0
THCudaCheck FAIL file=/Users/neptune/torch/extra/cutorch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
/Users/neptune/torch/install/bin/luajit: /Users/neptune/torch/install/share/lua/5.1/nn/utils.lua:11: cuda runtime error (2) : out of memory at /Users/neptune/torch/extra/cutorch/lib/THC/generic/THCStorage.cu:66
stack traceback:
    [C]: in function 'resize'
    /Users/neptune/torch/install/share/lua/5.1/nn/utils.lua:11: in function 'torch_Storage_type'
    /Users/neptune/torch/install/share/lua/5.1/nn/utils.lua:57: in function 'recursiveType'
    /Users/neptune/torch/install/share/lua/5.1/nn/Module.lua:160: in function 'type'
    /Users/neptune/torch/install/share/lua/5.1/nn/utils.lua:45: in function 'recursiveType'
    /Users/neptune/torch/install/share/lua/5.1/nn/utils.lua:41: in function 'recursiveType'
    /Users/neptune/torch/install/share/lua/5.1/nn/Module.lua:160: in function 'type'
    train.lua:96: in main chunk
    [C]: in function 'dofile'
    ...tune/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x0106f02bd0

PLEASE HELP!

mendadala commented 7 years ago

Solved this error by shutting down the system and running the code again from a fresh start. Everything worked!

cameronblandford commented 7 years ago

My issue was also fixed! Thanks @dj2mn :+1:

Benimation commented 7 years ago

For some reason my config file contained this: HDF5_INCLUDE_PATH = "/usr/local/Cellar/hdf5/1.8.19/include;/usr/local/opt/szip/include"

Which I simply changed to this: HDF5_INCLUDE_PATH = "/usr/local/Cellar/hdf5/1.8.19/include"

I have no idea where the extra ;/usr/local/opt/szip/include came from..

It now works, I am using macOS High Sierra (10.13)
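Several reports in this thread suggest that HDF5_INCLUDE_PATH has to be a single directory, presumably because the wrapper appends hdf5.h to the whole string (the earlier "unable to locate HDF5 header file at hdf5.h" message with an empty path is consistent with that). A minimal sketch for trimming a ';'-separated value down to its first entry, with a hypothetical helper name:

```shell
#!/bin/sh
# first_path_entry LIST
# Prints the part of a ';'-separated LIST before the first ';'.
first_path_entry() {
    printf '%s\n' "${1%%;*}"
}

# Example:
#   first_path_entry "/usr/local/Cellar/hdf5/1.8.19/include;/usr/local/opt/szip/include"
```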

timendez commented 6 years ago

@Benimation I've done that but it continues to give me the old path in the error. Did you need to do anything aside from change the file?

Benimation commented 6 years ago

@timendez I don't remember exactly what I did.. Probably about everything that's being suggested on this page..

timendez commented 6 years ago

@Benimation That's alright! I figured out I had two installations of hdf5 across two separate torches, which was messing a lot of stuff up. Removing one caused everything else to work
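To check for the same situation, one could list every copy of the wrapper's config file under the places Lua packages may have been installed. This is a sketch with a hypothetical helper name; which roots to search is an assumption and depends on your setup.

```shell
#!/bin/sh
# list_hdf5_configs ROOT...
# Prints every hdf5/config.lua found beneath the given ROOT directories.
list_hdf5_configs() {
    find "$@" -type f -path '*/hdf5/config.lua' 2>/dev/null
}

# Example, using locations that appear in tracebacks in this thread:
#   list_hdf5_configs "$HOME/torch/install" "$HOME/.luarocks"
```

More than one result means that edits to one config.lua may not be the ones the running th actually picks up.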