percyliang / brown-cluster

C++ implementation of the Brown word clustering algorithm.

Fix in strdb.cc file for handling large file reads #18

Open ajalagam opened 7 years ago

ajalagam commented 7 years ago

The code in strdb.cc crashes with a segmentation fault when run on large data files (around 5 GB, for example). This commit fixes the issue.

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff71293ab in __memcpy_ssse3_back () from /lib64/libc.so.6

0x0000000000422558 in read_text (file=<optimized out>, func=0x409ed0 <read_text_process_word(int)>, db=..., read_cached=<optimized out>,
    write_cached=<optimized out>, incorp_new=true) at basic/strdb.cc:189