mkusner / wmd

Word Mover's Distance from Matthew J Kusner's paper "From Word Embeddings to Document Distances"

Deadlock in Multiprocessing #25

Open dancing-with-coffee opened 6 years ago

dancing-with-coffee commented 6 years ago

Thank you for releasing the implementation of your paper.

First, I ran your code on your data (all_twitter_by_line.txt), and it worked well.

Second, I tried the 20newsgroup dataset used in your paper.

That run failed with the error

"emd: Maximum number of iterations has been reached 1013"

because of the MAX_SIG_SIZE limit of 100.

So I raised MAX_SIG_SIZE above the maximum number of unique keywords in the 20newsgroup dataset (5284).
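For reference, a minimal sketch of how such a bound could be measured, assuming each signature holds one entry per distinct word in a single document. The helper name `max_unique_tokens` and the whitespace tokenization are hypothetical and not part of the repository; any stop-word or vocabulary filtering done before the distance computation would lower the real number.

```python
def max_unique_tokens(documents):
    """Return the largest number of distinct whitespace-separated tokens
    found in any single document (0 for an empty collection)."""
    return max((len(set(doc.split())) for doc in documents), default=0)


# Hypothetical usage: one document per line, as in all_twitter_by_line.txt.
# with open("my_corpus_by_line.txt") as fh:
#     print(max_unique_tokens(fh))
```

The signature-size limit would need to be at least this value for the largest document to fit.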

Now the program blocks after some steps.

I suspect the multiprocessing code is the cause.

I checked CPU availability; it was 99% in a multi-CPU, multi-core environment.

Is there any solution for this?
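One way to tell a real deadlock from workers that are merely slow is to have each worker periodically print where it currently is. The sketch below uses the standard-library `faulthandler` module for that. It assumes the distances are computed in a `multiprocessing.Pool`; the helper name `watch_for_hangs` and the ten-minute interval are hypothetical choices, not something taken from the repository.

```python
import faulthandler
import sys


def watch_for_hangs(timeout_s=600.0, log_file=None):
    """Dump the Python stack of every thread in this process each
    `timeout_s` seconds, to `log_file` or to stderr if none is given."""
    target = log_file if log_file is not None else sys.stderr
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=target)


# Hypothetical usage: pass the helper as a Pool initializer so that
# every worker process reports its own stack.
# pool = multiprocessing.Pool(processes=4,
#                             initializer=watch_for_hangs,
#                             initargs=(600.0,))
```

If the dumps keep showing a worker inside the distance call, the computation is probably just slow at the larger signature size. If they show workers waiting on a queue or a lock, a deadlock is more likely.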