Closed: SuperYorio closed this issue 4 years ago
Hi @SuperYorio ,
Thanks for your kind message, I'm really happy that you found my book helpful! :)
I haven't had the chance to tackle many categorical clustering problems, so I don't really have a preferred method, sadly. Besides looking at the basic clustering algorithms (like K-Means), I would take a look at autoencoders: you can train an autoencoder (using Embedding layers for the categorical attributes), and then cluster the learned codings using any clustering algorithm.
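To make the autoencoder idea a bit more concrete, here is a minimal sketch, assuming TensorFlow/Keras and Scikit-Learn are installed. The toy data, the attribute cardinalities, the embedding and coding sizes, and the number of clusters are all made up for illustration, so treat it as a starting point rather than a tuned solution:

```python
import numpy as np
import tensorflow as tf
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
tf.random.set_seed(42)

# Hypothetical toy data: 3 integer-encoded categorical attributes.
# The cardinalities are assumptions, not taken from any real dataset.
cardinalities = [2, 5, 20]
n_samples = 500
X = np.stack(
    [rng.integers(0, card, size=n_samples) for card in cardinalities], axis=1
).astype("int32")

# Encoder: one Embedding layer per categorical attribute.
inputs, embedded = [], []
for i, card in enumerate(cardinalities):
    inp = tf.keras.layers.Input(shape=(1,), dtype="int32", name=f"attr_{i}")
    emb = tf.keras.layers.Embedding(input_dim=card, output_dim=4)(inp)
    embedded.append(tf.keras.layers.Flatten()(emb))
    inputs.append(inp)

concat = tf.keras.layers.Concatenate()(embedded)
codings = tf.keras.layers.Dense(3, name="codings")(concat)  # the learned codings

# Decoder: one softmax head per attribute, reconstructing its category.
outputs = [
    tf.keras.layers.Dense(card, activation="softmax", name=f"recon_{i}")(codings)
    for i, card in enumerate(cardinalities)
]

autoencoder = tf.keras.Model(inputs, outputs)
autoencoder.compile(
    optimizer="adam",
    loss=["sparse_categorical_crossentropy"] * len(cardinalities),
)

# Each attribute is fed (and reconstructed) as its own column.
columns = [X[:, i:i + 1] for i in range(X.shape[1])]
autoencoder.fit(columns, columns, epochs=5, batch_size=32, verbose=0)

# Cluster the codings with any clustering algorithm (K-Means here).
encoder = tf.keras.Model(inputs, codings)
Z = encoder.predict(columns, verbose=0)
labels = KMeans(n_clusters=4, n_init=10, random_state=42).fit_predict(Z)
```

On real data you would first map each category to an integer index (one vocabulary per attribute), and you would probably want to try several coding sizes and cluster counts.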
There seems to be abundant literature on categorical clustering, so it may be worth looking for a recent survey paper that summarizes the state of the art in this domain.
Hope this helps a tiny bit...
Hello Geron! Thank you for the great book, it is no understatement to say that it has helped me advance my career! As a data science research intern at a medical school, I have a question:
What is YOUR preferred clustering algorithm for categorical attributes?
I'm trying to find ways to cluster a fake but close to real-world patient dataset that includes Gender, Race, and Medication & Procedural Records (the meat of the data). I understand that there are algorithms such as "K-Modes", etc., but I am just curious about what your favorite algorithm would be!
Thank you again for the great book and your time! :)