online-ml / river

🌊 Online machine learning in Python
https://riverml.xyz
BSD 3-Clause "New" or "Revised" License

Transformation after classifier #633

Closed snthibaud closed 3 years ago

snthibaud commented 3 years ago

Is it possible to put a transformation after a classifier? I would like to use a KNN classifier to find the nearest neighbour from a large number of labels. Then I would like to use some distance measure to either reject or accept the nearest neighbour (close enough to be a match or not). Finally, it would be nice to apply a binary metric on the resulting binary predictions. Is it possible to construct this with river?

MaxHalford commented 3 years ago

Hello! Sorry for the late answer: holidays etc.

> I would like to use a KNN classifier to find the nearest neighbour from a large number of labels.

Can you clarify? This doesn't make a lot of sense to me. Are you trying to do classification?

> Then I would like to use some distance measure to either reject or accept the nearest neighbour (close enough to be a match or not).

So this is some logic that you would like to add to the nearest neighbor classifier? As in, classify as None or return uniform probabilities if the nearest neighbors are too far, correct?

> Finally, it would be nice to apply a binary metric on the resulting binary predictions.

What do you mean? A metric from the metrics module?

snthibaud commented 3 years ago

> Can you clarify? This doesn't make a lot of sense to me. Are you trying to do classification?

Apologies for the confusion. I was actually just trying to find the label of the nearest neighbour in an efficient way. I think this is basically classification, but in this case the number of labels was close to the number of instances.

> So this is some logic that you would like to add to the nearest neighbor classifier? As in, classify as None or return uniform probabilities if the nearest neighbors are too far, correct?

Yes, that's correct.
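
For anyone reading along, the behaviour agreed on above (return None when the nearest neighbour is too far away) could be sketched roughly as follows. This is not river's KNN classifier and not the component actually written for this issue: `ThresholdedNearestNeighbour` and `max_distance` are hypothetical names, the search is brute force, and the distance is assumed to be Euclidean. It only borrows the `learn_one` / `predict_one` naming so that it reads like a river-style model.

```python
import math


class ThresholdedNearestNeighbour:
    """Hypothetical brute-force 1-NN that rejects matches that are too far away."""

    def __init__(self, max_distance):
        # Assumption: a single global distance threshold decides accept/reject.
        self.max_distance = max_distance
        self._memory = []  # stored (features, label) pairs

    def learn_one(self, x, y):
        # Store a copy of the feature dict together with its label.
        self._memory.append((dict(x), y))
        return self

    @staticmethod
    def _distance(a, b):
        # Euclidean distance over the union of keys; missing features count as 0.
        keys = set(a) | set(b)
        return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))

    def predict_one(self, x):
        if not self._memory:
            return None
        # Find the closest stored point and its label.
        dist, label = min(
            ((self._distance(x, xi), yi) for xi, yi in self._memory),
            key=lambda pair: pair[0],
        )
        # Reject the neighbour (return None) if it is not close enough.
        return label if dist <= self.max_distance else None


model = ThresholdedNearestNeighbour(max_distance=1.0)
model.learn_one({"a": 0.0, "b": 0.0}, "label-1")
model.learn_one({"a": 5.0, "b": 5.0}, "label-2")

near = model.predict_one({"a": 0.2, "b": 0.1})  # close to "label-1"
far = model.predict_one({"a": 2.5, "b": 2.5})   # far from every stored point
```

With many labels and many instances, the linear scan in `predict_one` would likely be the bottleneck, so a real component would probably delegate the search to an existing nearest-neighbour structure.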

What do you mean? A metric from the metrics module?

Yes. I found the MCC metric in the metrics module, which works on binary labels, but I first needed to convert the labels from my custom classifier (as you said, it can return None).

I was able to resolve this myself by writing some custom components, but still using your beautiful framework. Thank you for getting back to me! 👍
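
For reference, the label conversion mentioned above might look something like the sketch below. It is not the code used to resolve this issue. It assumes that "binary" means "a match was accepted" versus "no match", that the ground truth says whether a query really has a match, and it computes MCC by hand in plain Python so that the example does not depend on any particular metrics API. The helper names `is_match` and `mcc` are hypothetical.

```python
import math


def is_match(predicted_label):
    # A prediction of None means the nearest neighbour was rejected.
    return predicted_label is not None


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for two sequences of booleans."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0 when the denominator vanishes.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Made-up illustration: whether each query truly has a match,
# and what a classifier that can return None predicted for it.
has_match = [True, True, False, False]
raw_predictions = ["label-1", "label-2", None, "label-3"]

binary_predictions = [is_match(p) for p in raw_predictions]
score = mcc(has_match, binary_predictions)
```

The booleans in `binary_predictions` are the kind of input a binary metric expects, so the same conversion step could sit between a None-returning classifier and a metric from the metrics module.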