Jekub / Wapiti

A simple and fast discriminative sequence labeling toolkit ( http://wapiti.limsi.fr )

segfault when training with bcd #10

Open gromgull opened 10 years ago

gromgull commented 10 years ago

Hi,

Wapiti segfaults on my Linux box when training on a moderately large input file with the bcd algorithm:

  [snip]
  31000 sequences loaded
  32000 sequences loaded
  33000 sequences loaded
  34000 sequences loaded
* Initialize the model
* Summary
    nb train:    323212
    nb devel:    34346
    nb labels:   7
    nb blocks:   15080528
    nb features: 105563745
* Train the model with bcd
    - Build the index
        1/2 -- scan the sequences
./train.sh: line 1: 16978 Segmentation fault      (core dumped) wapiti train --compact --algo bcd --pattern patterns/brownpattern-2-self.txt --devel data/no-wiki/no-wiki-more-doc-only-all-brown_ak-ak1kmin10kpos-devel data/no-wiki/no-wiki-more-doc-only-all-brown_ak-ak1kmin10kpos-train models/wapiti/no-wiki-more-doc-only-all-self-bcd-brown_ak-ak1kmin10kpos-train

Rebuilding Wapiti with debugging info gives only this not very useful stack trace:

(gdb) bt
#0  0x000000000040284a in trn_bcd (mdl=<optimised out>) at src/bcd.c:298
#1  0x0000000000401ea6 in dotrain (mdl=0xab88d0) at src/wapiti.c:161
#2  main (argc=<optimised out>, argv=<optimised out>) at src/wapiti.c:401

Training with rprop works fine; I was just curious whether bcd would train faster or produce a better model.