Jekub / Wapiti

A simple and fast discriminative sequence labeling toolkit ( http://wapiti.limsi.fr )

train large corpus throws segmentation fault #1

Closed zhangmiao closed 11 years ago

zhangmiao commented 11 years ago

I used wapiti to train a CRFs model.

When the data file was 11M, training worked fine. But when it grew to 24M, training crashed with a segmentation fault.

I traced the bug and found that "uint32_t out[T]" is used to declare a stack array in the function "tag_evalsub" in "src/decoder.c".

I suggest using "xmalloc" instead to fix the problem.

Thank you for sharing.

Jekub commented 11 years ago

Thanks for reporting this, it is now fixed. This one was missed when the switch to dynamic allocation was done.

Murhaf commented 11 years ago

I am still getting a `segfault` error, but my data set is very large: 523M. It's worth mentioning that I have successfully trained CRFs on data sets of about 100M.

Thanks a lot!