online-ml / river

🌊 Online machine learning in Python
https://riverml.xyz
BSD 3-Clause "New" or "Revised" License

Make kNN more resilient to nominal data #1566

Closed e10e3 closed 3 months ago

e10e3 commented 3 months ago

Fixes #1565 by using a different formula for the difference when the data is not numeric.

This keeps the distance computation from raising an error when non-numeric values are given, while still behaving reasonably.
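As a rough illustration of the idea (a minimal sketch, not the code in this PR), one common approach is to use the absolute difference for numeric features and a 0/1 mismatch for everything else. The function name `mixed_minkowski_distance`, the helper `_is_number`, and the choice to treat a missing feature as `0` are all assumptions made for this example.

```python
import numbers


def _is_number(value) -> bool:
    # Hypothetical helper: ints, floats and bools count as numeric.
    return isinstance(value, numbers.Real)


def mixed_minkowski_distance(a: dict, b: dict, p: int = 2) -> float:
    """Minkowski-style distance over feature dicts that tolerates nominal values.

    Assumption for this sketch: numeric features contribute |a - b|,
    while any non-numeric pair contributes 0 if equal and 1 otherwise.
    """
    total = 0.0
    for key in a.keys() | b.keys():
        # Assumption: a feature missing from one side is treated as 0.
        va, vb = a.get(key, 0), b.get(key, 0)
        if _is_number(va) and _is_number(vb):
            diff = abs(va - vb)
        else:
            # Nominal (or mixed-type) values: simple mismatch indicator.
            diff = 0.0 if va == vb else 1.0
        total += diff**p
    return total ** (1 / p)


x = {"size": 1.0, "colour": "red"}
y = {"size": 4.0, "colour": "blue"}
distance = mixed_minkowski_distance(x, y, p=2)
```

The per-feature type check in the loop is the kind of extra branch that is hard to optimise away, which fits the slowdown described in the caveat.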

Caveat

Because of the added code paths that are difficult to optimise away, kNN is roughly twice as slow to make a prediction.

smastelini commented 3 months ago

Please see my comments on #1565.