Lyonk71 / pandas-dedupe

Simplifies use of the Dedupe library via Pandas

Pandas_dedupe Not working on Windows #37

Open cfatls opened 3 years ago

cfatls commented 3 years ago

I installed Microsoft Build Tools 2015 as per the video tutorial mentioned in the docs: https://www.youtube.com/watch?v=lCFEzRaqoJA&ab_channel=KeithLyons

However, whenever I try to run it, it gets stuck at: "Finished labeling Clustering..."

It works on Linux with no hassle. On Windows, however, I can't get past clustering, regardless of how large or small the file I'm deduping is. Please help; any advice is appreciated.

I'm running Windows 10. Python 3.8.3 (default, Jul 2 2020, 17:30:36) [MSC v.1916 64 bit (AMD64)]

belkacem-ayachi commented 3 years ago

Same issue here

ieriii commented 3 years ago

Unfortunately, I couldn't reproduce the issue on my Windows machine.

Two suggestions:

1. Try setting the argument n_cores to a value lower than the number of cores on your laptop.
2. Install Python and pandas-dedupe in a new virtual environment and see whether it works.

Thanks.
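
For reference, the first suggestion could look roughly like the sketch below. It assumes the installed pandas-dedupe version accepts `n_cores` in `dedupe_dataframe`, as the comment above implies. The helper names `pick_n_cores` and `run_dedupe` and the column names in the usage note are made up for illustration and are not part of the library.

```python
import os


def pick_n_cores(reserve=1):
    """Return a worker count below the machine's core count (never less than 1)."""
    total = os.cpu_count() or 1  # os.cpu_count() can return None
    return max(1, total - reserve)


def run_dedupe(df, fields):
    """Hypothetical wrapper: run pandas-dedupe with a reduced core count."""
    # Imported lazily so pick_n_cores() can be used without pandas-dedupe installed.
    import pandas_dedupe

    # n_cores is the argument named in the suggestion above; assumed to be
    # supported by the installed pandas-dedupe version.
    return pandas_dedupe.dedupe_dataframe(df, fields, n_cores=pick_n_cores())
```

A call might then look like `run_dedupe(df, ["name", "address"])`, with the column names replaced by your own. On Windows, Python's multiprocessing starts worker processes by re-importing the main module, so making the call from under an `if __name__ == "__main__":` guard is a common precaution. Whether that is related to the hang reported here has not been established.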