segmentation fault in rnnlm-hs

nyaong7 commented 8 years ago

Hello!!

I am testing with rnnlm-hs and encountered a segmentation fault. Here's the error message I get when running it under gdb.

Alpha: 0.100000 ME-alpha: 0.100000 Progress: 99.94% Words/thread/sec: 2.58k
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffa5ffb700 (LWP 17818)]
TrainModelThread (id=) at rnnlm.c:495
495 l2 = vocab[word].point[d] * layer1_size;

My data may be somewhat large (training completed successfully when I used a smaller data set).

Vocab size: 52722
Words in train file: 285559160

And here are my settings for the test.

./rnnlm -train data/train -valid data/valid -hidden 300 -bptt 6 -threads 20 -alpha 0.1 -maxent-order 5 -maxent-size 1000 -rnnlm ./model.rnnlm -debug 2

The following are the constant settings in the code (rnnlm.c):

#define MAX_STRING 1024

#define MAX_SENTENCE_LENGTH 10000

#define MAX_CODE_LENGTH 40

How can I avoid or debug this error? Any help would be appreciated.

Thank you very much.

vimalmanohar commented 8 years ago

Try to pull the latest code from https://github.com/kaldi-asr/kaldi. It may be a bug that was fixed. You can create a new issue there if it does not work.
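To narrow down the crash itself, it may help to check the two indices used on the faulting line, `vocab[word].point[d]`. The following is only a minimal sketch, not the actual rnnlm-hs code: the struct is a simplified stand-in, and the names `vocab_size`, `codelen`, and `index_ok` are assumptions made for illustration.

```c
#define MAX_CODE_LENGTH 40

/* Hypothetical, simplified stand-in for the vocabulary entry in rnnlm.c.
   Only the fields needed for the check are shown; `codelen` is an assumed
   field holding the number of valid entries in `point`. */
struct vocab_word {
  int point[MAX_CODE_LENGTH];
  int codelen;
};

/* Returns 1 if reading vocab[word].point[d] stays in bounds, 0 otherwise.
   `vocab_size` is the assumed number of entries in `vocab`. */
int index_ok(long long word, int d, long long vocab_size,
             const struct vocab_word *vocab) {
  /* The word index must refer to an existing vocabulary entry. */
  if (word < 0 || word >= vocab_size) return 0;
  /* The code position must fit the array and the word's code length. */
  if (d < 0 || d >= MAX_CODE_LENGTH || d >= vocab[word].codelen) return 0;
  return 1;
}
```

A check like this could be called just before the faulting line to report the offending values instead of crashing. In gdb, `print word` and `print d` at the point of the crash (plus `bt` for the backtrace) show the same information without recompiling, provided the variables have not been optimized out.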

Vimal Manohar PhD Student Electrical & Computer Engineering Johns Hopkins University

nyaong7 commented 8 years ago

Thank you for the advice.

Unfortunately, I could not find rnnlm-hs-0.1b at the link you suggested (https://github.com/kaldi-asr/kaldi). I found an installation script for the original RNNLM (rnnlm-0.3e), but none for rnnlm-hs. Can I find the latest version somewhere else?

Thank you very much for the help.

vimalmanohar commented 8 years ago

https://github.com/yandex/faster-rnnlm

I don't know if this is what you are looking for.

Kaldi stopped supporting this because we weren't getting good enough results.

Vimal

nyaong7 commented 8 years ago

Thank you for such a quick reply!!

I believe that they are different. I am using rnnlm-hs-0.1b from https://github.com/vimalmanohar/kaldi-git/tree/master/tools/rnnlm-hs-0.1b. I tried to find the same directory (rnnlm-hs-0.1b) from https://github.com/vimalmanohar/kaldi-git/tree/master/tools/rnnlm-hs-0.1b, but I could not find anything similar to what I was looking for.

Thank you very much!

vimalmanohar commented 8 years ago

The link that I sent is the latest version of the rnnlm-hs-0.1b that you are using. It is an RNNLM toolkit created by Ilya Edrenkin, Yandex LLC (https://github.com/ilya-edrenkin/faster-rnnlm), although the latest version is the one at the previous link I sent.

nyaong7 commented 8 years ago

OK, I see. I was confused because the names are different. Thank you very much!
