J535D165 / recordlinkage

A powerful and modular toolkit for record linkage and duplicate detection in Python
http://recordlinkage.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
966 stars 152 forks

When will packages like Dask or Ray (or Modin) be supported? #182

Open ialvata opened 2 years ago

ialvata commented 2 years ago

Right now, if we're dealing with big data, this package seems rather slow and unoptimized. For example, my laptop has 16 threads, yet only 1 of them is used during record linkage. It would be so simple to just use Modin...