mkusner / wmd

Word Mover's Distance from Matthew J Kusner's paper "From Word Embeddings to Document Distances"

Parallel processing #6

Open josepablog opened 7 years ago

josepablog commented 7 years ago

Wow, great paper! Thank you for making the code OSS.

The documentation says that the Python wrapper is not suitable for parallel execution:

"The wrapper is not suited for concurrent execution. It uses a global variable for the distance callback function, so calling emd from concurrent threads will result in undefined behavior."

However, the function get_wmd calls emd concurrently. Can you please explain?
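To make the concern concrete, here is a minimal sketch of the pattern the documentation describes. It is not the wrapper's actual code: `unsafe_emd`, `locked_emd`, `make_scaled_distance` and `run_all` are hypothetical names, and the "solver" is a toy one-liner standing in for emd. The assumption is only that the callback lives in a module-level global, so one way to make threaded calls safe is to serialize them behind a lock.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a wrapper that stores its distance callback in a
# module-level global, as the quoted documentation says the emd wrapper does.
_distance_callback = None


def unsafe_emd(x, y, distance):
    """Toy 'solver': stores the callback globally, then calls it back."""
    global _distance_callback
    _distance_callback = distance    # shared state: another thread may overwrite it
    return _distance_callback(x, y)  # a real solver would call back many times


# One lock shared by all callers, so only one thread touches the global at a time.
_emd_lock = threading.Lock()


def locked_emd(x, y, distance):
    """Serialize access to the global callback so threaded calls cannot interleave."""
    with _emd_lock:
        return unsafe_emd(x, y, distance)


def make_scaled_distance(scale):
    """Build a distinct callback per job so any mix-up would show in the results."""
    return lambda x, y: scale * abs(x - y)


def run_all(n_jobs=8):
    """Submit n_jobs calls from a thread pool, each with its own callback."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [
            pool.submit(locked_emd, 0.0, 1.0, make_scaled_distance(s))
            for s in range(n_jobs)
        ]
        return [f.result() for f in futures]


if __name__ == "__main__":
    print(run_all())
```

Note that the lock makes the calls safe but also runs them one at a time, so it buys correctness and no speedup. Getting real parallelism out of a wrapper like this would presumably mean separate processes, since each process has its own copy of the global.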

mkusner commented 7 years ago

Ah, this does look like an oversight! If you need parallel processing, maybe gensim's WMD code supports it? https://radimrehurek.com/gensim/models/word2vec.html#gensim.models.word2vec.Word2Vec.wmdistance