cemoody / lda2vec


Segmentation fault (core dumped) error in preprocess.py #72

Open ghost opened 6 years ago

ghost commented 6 years ago

Hi,

I tried following the Hacker News preprocessing example with my own data set (640,000 rows, 154 MB). It crashes with "Segmentation fault (core dumped)" when I run

tokens, vocab = preprocess.tokenize(texts, max_length, n_threads=4, merge=True)

on line 46 of the code. I thought the data set might be too large, so I reduced it to only 100 rows, but the same error occurs. How can I resolve this issue?

Thank you, Janice