Segmentation fault (core dumped) error in preprocess.py

Hi,

I followed the Hacker News preprocessing example using my own data set (640,000 rows, 154 MB). The script fails with Segmentation fault (core dumped) when it runs

tokens, vocab = preprocess.tokenize(texts, max_length, n_threads=4, merge=True)

on line 46 of the code. I thought the data set might be too large, so I reduced it to only 100 rows, but the same error occurs. How can I resolve this?
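
In case it helps narrow this down, here is a minimal sketch of how the failing call could be wrapped so that Python prints a stack trace when the crash happens in native code. It assumes Python 3.3+ (where the standard-library `faulthandler` module is available). The helper name `run_with_traceback` is hypothetical and not part of lda2vec. Dropping to `n_threads=1` in the commented usage is only a guess at something worth trying, not a known fix.

```python
import faulthandler
import sys


def run_with_traceback(fn, *args, **kwargs):
    """Call fn with faulthandler enabled.

    If the interpreter receives SIGSEGV inside fn, faulthandler dumps the
    Python-level stack of every thread before the process dies, which shows
    which Python line led into the crashing native code.
    """
    if not faulthandler.is_enabled():
        # sys.__stderr__ is the original stderr stream, used here so the
        # dump still works if sys.stderr has been redirected.
        faulthandler.enable(file=sys.__stderr__, all_threads=True)
    return fn(*args, **kwargs)


# Hypothetical usage with the call from this report (variable names as in
# the example script; the 100-row slice and n_threads=1 are assumptions):
#
# from lda2vec import preprocess
# tokens, vocab = run_with_traceback(
#     preprocess.tokenize, texts[:100], max_length, n_threads=1, merge=True)
```

The same effect is available without changing the script by running it as `python -X faulthandler preprocess.py`.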

Thank you, Janice

cemoody / lda2vec

Segmentation fault (core dumped) error in preprocess.py #72