J535D165 / recordlinkage

A powerful and modular toolkit for record linkage and duplicate detection in Python
http://recordlinkage.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

threshold in compare is broken #161

Open skuam opened 3 years ago

skuam commented 3 years ago

I am referring to the code below. A NumPy array has no where() method, so when c is an array you get this error:

    c = c.where((c < self.threshold) | (pandas.isnull(c)), other=1.0)
    AttributeError: 'numpy.ndarray' object has no attribute 'where'

    if self.threshold is not None:
        c = c.where((c < self.threshold) | (pandas.isnull(c)), other=1.0)
        c = c.where((c >= self.threshold) | (pandas.isnull(c)), other=0.0)

in https://github.com/J535D165/recordlinkage/blob/5b3230f5cff92ef58968eedc451735e972035793/recordlinkage/compare.py#L152
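
For anyone who wants to see the failure outside the library, here is a minimal sketch. It assumes, based only on the traceback above, that c reaches the threshold step as a plain NumPy array rather than a pandas Series; the variable names and sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

threshold = 0.5
as_series = pd.Series([0.2, 0.8, np.nan])  # made-up similarity scores
as_array = as_series.to_numpy()

# pandas.Series has a .where method, so the library's expression works here.
on_series = as_series.where((as_series < threshold) | pd.isnull(as_series), other=1.0)
print(on_series)

# numpy.ndarray has no .where method, so the same expression fails.
try:
    as_array.where((as_array < threshold) | pd.isnull(as_array), other=1.0)
except AttributeError as exc:
    message = str(exc)
    print(message)
```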

A much cleaner solution would be to just use this instead:

    if self.threshold is not None:
        c = c >= self.threshold
        c = c.astype(float)
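
One caveat with the shorter version: the original pandas code leaves missing values as NaN, while comparing with >= and casting to float turns NaN into 0.0. If keeping NaN matters, something like the sketch below might work for both arrays and Series. Note that apply_threshold is a hypothetical helper written for this comment, not a function in recordlinkage.

```python
import numpy as np
import pandas as pd


def apply_threshold(c, threshold):
    """Return 1.0 where c >= threshold, 0.0 where below, and NaN where c is NaN.

    Hypothetical helper for illustration; not part of recordlinkage.
    """
    if threshold is None:
        return c
    values = np.asarray(c, dtype=float)
    # np.where is a module-level function, so it works on plain arrays.
    binary = np.where(values >= threshold, 1.0, 0.0)
    # Comparisons against NaN are False, so restore missing values explicitly.
    binary = np.where(np.isnan(values), np.nan, binary)
    if isinstance(c, pd.Series):
        # Keep the index and name when the input was a Series.
        return pd.Series(binary, index=c.index, name=c.name)
    return binary


scores = np.array([0.2, 0.8, np.nan, 0.5])  # made-up similarity scores
result = apply_threshold(scores, 0.5)
print(result)
```

Whether NaN should survive the threshold step is a design decision for the maintainers; the sketch only shows that both behaviours are easy to get without calling .where on an array.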