mkusner / wmd

Word Mover's Distance from Matthew J Kusner's paper "From Word Embeddings to Document Distances"

I want to use WMD on Chinese data, but I get errors. Please help! #20

Open 15611511155 opened 7 years ago

15611511155 commented 7 years ago

root@user-virtual-machine:/home/user/WMD# python wmd.py asd.pk asdwmd.pk
[pool :] <multiprocessing.pool.Pool object at 0x7f327f1cc150>
0 out of 3
1 out of 3
emd: Signature size is limited to 100
2 out of 3
emd: Signature size is limited to 100


Both stop.txt and the training data are in Chinese. How can I solve this problem?

kfzyqin commented 6 years ago

mkusner/wmd#18

You have to modify a line in "python-emd-master\emd.h" (for Python) or "emd\emd.h" (for MATLAB).

Find the definition of MAX_SIG_SIZE:

#define MAX_SIG_SIZE 100

Change 100 to a larger number (at least the number of distinct words in your largest document's bag-of-words, BOW). Remember to rebuild (re-run make) afterwards so the change takes effect.
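
If you are not sure how large the value needs to be, you can measure your data first. Below is a minimal sketch, not part of the wmd repository: `largest_bow_size`, `docs`, and `stops` are hypothetical names, and it assumes your documents are already segmented into lists of words with stop words supplied as a set. It counts the distinct non-stop words per document and reports the largest count.

```python
from collections import Counter


def largest_bow_size(documents, stop_words=frozenset()):
    """Return the largest number of distinct non-stop-word tokens in any document.

    `documents` is assumed to be an iterable of token lists (hypothetical input
    format, not the pickle layout that wmd.py reads).
    """
    largest = 0
    for tokens in documents:
        # Bag-of-words: one entry per distinct token that is not a stop word.
        bow = Counter(t for t in tokens if t not in stop_words)
        largest = max(largest, len(bow))
    return largest


# Hypothetical pre-segmented Chinese documents, for illustration only.
docs = [
    ["我", "喜欢", "自然", "语言", "处理"],
    ["我", "喜欢", "机器", "学习", "和", "深度", "学习", "模型"],
]
stops = {"我", "和"}

print(largest_bow_size(docs, stops))  # prints 5
```

Run it on your own corpus and set MAX_SIG_SIZE to at least the number it prints, then rebuild.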