nicodv / kmodes

Python implementations of the k-modes and k-prototypes clustering algorithms, for clustering categorical data
MIT License

Add sample weights to KPrototypes. #171

Closed kklein closed 2 years ago

kklein commented 2 years ago

Currently, every data point contributes equally to the loss function and to the cluster updates derived from it.

Yet, in some use cases, it might be desirable to attach weights to data points.

This PR introduces sample_weights, an optional sequence of numeric values, as a parameter of KPrototypes' fit method and of all downstream functions.

Basic input validation and tests are included.
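
To illustrate the idea, here is a minimal standalone sketch of how per-point weights could enter the cluster updates. It is not the code from this PR and does not call the kmodes library; the helper names check_sample_weights, weighted_mean and weighted_mode are hypothetical, and the validation rules shown are assumptions about what "basic input validation" might cover.

```python
import numpy as np


def check_sample_weights(sample_weights, n_samples):
    """Hypothetical validator: one finite, non-negative number per data point."""
    weights = np.asarray(sample_weights, dtype=float)
    if weights.shape != (n_samples,):
        raise ValueError("sample_weights must contain one value per data point")
    if not np.all(np.isfinite(weights)) or np.any(weights < 0):
        raise ValueError("sample_weights must be finite and non-negative")
    if weights.sum() == 0:
        raise ValueError("sample_weights must not all be zero")
    return weights


def weighted_mean(numeric_rows, weights):
    """Numerical part of a prototype: weight-averaged feature values."""
    numeric_rows = np.asarray(numeric_rows, dtype=float)
    return np.average(numeric_rows, axis=0, weights=weights)


def weighted_mode(categories, weights):
    """Categorical part of a prototype: category with the largest total weight."""
    totals = {}
    for category, weight in zip(categories, weights):
        totals[category] = totals.get(category, 0.0) + weight
    return max(totals, key=totals.get)


# Three points assigned to one cluster; the first point counts three times as much.
weights = check_sample_weights([3, 1, 1], n_samples=3)
numeric_center = weighted_mean([[0.0], [10.0], [10.0]], weights)
categorical_center = weighted_mode(["a", "b", "b"], weights)
```

With equal weights the categorical prototype of this cluster would be "b"; with the weights above the heavier first point pulls it to "a", and the numerical prototype moves toward 0 in the same way.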

kklein commented 2 years ago

Hi @nicodv ! Happy to hear your thoughts. :)

coveralls commented 2 years ago

Coverage increased (+1.3%) to 97.908% when pulling 03e9ac68cbf923e8d53223def7e4f98fe542c802 on kklein:master into 370d64b1067331b413d641103a52bd4c636ac702 on nicodv:master.

kklein commented 2 years ago

Thanks a bunch for your fast and very useful feedback! :)