nicodv / kmodes

Python implementations of the k-modes and k-prototypes clustering algorithms, for clustering categorical data
MIT License

parallelization #173

Closed KhatriVivek closed 2 years ago

KhatriVivek commented 2 years ago

Would it be possible to modify the code below to illustrate parallel computing on several cores?

import numpy as np
from kmodes.kmodes import KModes

# random categorical data
data = np.random.choice(20, (100, 10))

km = KModes(n_clusters=4, init='Huang', n_init=5, verbose=1)

clusters = km.fit_predict(data)

# Print the cluster centroids
print(km.cluster_centroids_)

nicodv commented 2 years ago

Parallelization over multiple initializations is already possible: https://github.com/nicodv/kmodes#parallel-execution

KhatriVivek commented 2 years ago

Thanks. I did see that. I am more familiar with R than Python, so I was asking for help in case it is an easy change.

nicodv commented 2 years ago

Yeah, so just change it to:

km = KModes(n_clusters=4, init='Huang', n_init=5, n_jobs=<minimum of number of cores or n_init here>, verbose=1)
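
A minimal sketch of filling in that placeholder, assuming the core count is read with `os.cpu_count()`. The helper names `pick_n_jobs` and `run_kmodes` are hypothetical and only for illustration; `n_jobs` is the `KModes` argument from the line above.

```python
import os


def pick_n_jobs(n_init, n_cores=None):
    """Return the minimum of the core count and n_init, but at least 1.

    Hypothetical helper, not part of kmodes.
    """
    if n_cores is None:
        # os.cpu_count() can return None if the count cannot be determined
        n_cores = os.cpu_count() or 1
    return max(1, min(n_cores, n_init))


def run_kmodes(data, n_clusters=4, n_init=5):
    """Fit KModes with the n_init initializations spread over several cores."""
    # Imported here so pick_n_jobs can be used without kmodes installed
    from kmodes.kmodes import KModes

    km = KModes(n_clusters=n_clusters, init='Huang', n_init=n_init,
                n_jobs=pick_n_jobs(n_init), verbose=1)
    clusters = km.fit_predict(data)
    return km, clusters
```

Calling `run_kmodes(data)` with the array from the original snippet returns the fitted model and the cluster labels. Since the parallelization is over initializations, asking for more jobs than `n_init` should not add any speed-up, which is why the value is capped.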

KhatriVivek commented 2 years ago

Excellent. Thank you so much
