flying-sheep opened 5 years ago
Thanks! This is the best example I could find that compares them: https://jmonlong.github.io/Hippocamplus/2018/02/13/tsne-and-clustering/ (comparison is to Louvain, but I'm assuming it would be similar).
I will try to reproduce this notebook with open data and add this algorithm when I do.
In general I'm more interested in continuous "labels" than clustering because I think our eyes can pick out more detail from the continuous variation, but I'm very curious what happens here.
Yup, continuous labels are closer to the truth than clustering when there are a lot of continuous transitions going on.
However, in that case t-SNE isn’t a good choice. You could try UMAP for such data.
Why do you say UMAP is better for data with continuous variation than t-SNE? I have a lot of experience with both, but haven't seen or read anything to indicate this.
I’m pretty surprised that you haven’t; the preservation of that kind of structure is one of UMAP’s main selling points. t-SNE rips things apart, UMAP doesn’t. Here’s the first Google hit for “umap vs tsne”, which says:
[…] notably highlighting faster runtime and consistency, meaningful organization of cell clusters and preservation of continuums in UMAP compared to t-SNE.
Hi! I think for what you’re doing, you might consider running a community detection algorithm on the high-dimensional data (or on a set of PCA/ICA components to speed things up).
These algorithms serve us very well in the computational biology world, much better than the primitive k-means or DBSCAN.
Until recently, we used Louvain community detection, but the author of the Python package recently published an improved version: https://github.com/vtraag/leidenalg
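To make the suggestion a bit more concrete, here is a rough, pure-Python sketch of the idea: build a k-nearest-neighbour graph on the high-dimensional points, then score a candidate partition by its modularity, which is the quantity Louvain optimizes. This is not the leidenalg API and it does not search for a partition; the helper names `knn_edges` and `modularity`, the toy data, and the choice of `k=5` are all assumptions made for illustration.

```python
import math
import random


def knn_edges(points, k):
    """Return the undirected edge set of a k-nearest-neighbour graph."""
    edges = set()
    for i, p in enumerate(points):
        # Distances from point i to every other point, nearest first.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        for _, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def modularity(edges, labels):
    """Newman modularity Q of a partition of an unweighted, undirected graph."""
    m = len(edges)
    inside = {}        # edges with both endpoints in community c
    total_degree = {}  # summed node degree of community c
    for i, j in edges:
        for node in (i, j):
            c = labels[node]
            total_degree[c] = total_degree.get(c, 0) + 1
        if labels[i] == labels[j]:
            inside[labels[i]] = inside.get(labels[i], 0) + 1
    return sum(
        inside.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in total_degree.items()
    )


# Toy data (an assumption): two well-separated blobs in 10 dimensions.
rng = random.Random(0)
blob_a = [[rng.gauss(0, 1) for _ in range(10)] for _ in range(30)]
blob_b = [[rng.gauss(10, 1) for _ in range(10)] for _ in range(30)]
points = blob_a + blob_b

edges = knn_edges(points, k=5)

true_labels = [0] * 30 + [1] * 30                # one community per blob
mixed_labels = [i % 2 for i in range(60)]        # labels that ignore the blobs

q_true = modularity(edges, true_labels)
q_mixed = modularity(edges, mixed_labels)
print(f"modularity, blob labels:  {q_true:.3f}")
print(f"modularity, mixed labels: {q_mixed:.3f}")
```

The partition that follows the blobs should score clearly higher than the one that ignores them. A community detection package would search over partitions to maximize a score like this instead of being handed the labels.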