cemoody / lda2vec

MIT License
3.15k stars 629 forks

lda2vec preprocessing does not handle memory well #59

Open hxsnow10 opened 6 years ago

hxsnow10 commented 6 years ago

When I run preprocessing on 6 GB of plain text with 3 threads and max_len=4W (40,000), it uses up all of my 120 GB of memory plus 180 GB of swap.

I suspect some part of the preprocessing does not scale well to large data.

ghost commented 5 years ago

I've hit MemoryError several times as well, and for now I have to fall back to a smaller amount of data.