Open hadisfr opened 3 years ago
Using 128 dimensions and then using UMAP to reduce the result to `x` and `y`, I ended up with this:
Is this the right approach? Can I make it better?
Did you calculate the modularity of Louvain algorithm and of, say, k-means on the embedding? Are they comparable?
No. How can I do that? Feed the final bidimensional embedding result to sklearn or something? 🤔
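To make the suggested comparison concrete, here is a minimal sketch of computing the modularity of a Louvain partition versus a k-means clustering of an embedding. It assumes `networkx` (>= 2.8, for `louvain_communities`) and scikit-learn are installed; the karate-club graph and the random embedding are stand-ins for your graph and your VERSE output.

```python
# Sketch: compare modularity of Louvain vs. k-means on an embedding.
# The graph and the embedding below are placeholders, not the real data.
import numpy as np
import networkx as nx
from networkx.algorithms.community import louvain_communities, modularity
from sklearn.cluster import KMeans

G = nx.karate_club_graph()  # stand-in for your graph

# Louvain partition and its modularity on the graph
louvain_parts = louvain_communities(G, seed=42)
q_louvain = modularity(G, louvain_parts)

# Hypothetical embedding: random here, replace with your 128d VERSE output
rng = np.random.default_rng(0)
emb = rng.normal(size=(G.number_of_nodes(), 128))

# k-means on the embedding, with k matching the Louvain partition,
# then score that clustering with the same modularity measure
labels = KMeans(n_clusters=len(louvain_parts), n_init=10,
                random_state=0).fit_predict(emb)
kmeans_parts = [set(np.flatnonzero(labels == c).tolist())
                for c in range(labels.max() + 1)]
q_kmeans = modularity(G, kmeans_parts)

print(f"Louvain modularity: {q_louvain:.3f}")
print(f"k-means modularity: {q_kmeans:.3f}")
```

If the two modularity values are comparable, the embedding is capturing the community structure well.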
P.S. I have seen this approach many times: feeding higher-dimensional embeddings from VERSE or node2vec into UMAP to get a bidimensional embedding for visualization, and it seems to work better than using e.g. VERSE to get a bidimensional embedding directly. But I don't get it. Isn't UMAP just another embedding tool like VERSE and node2vec, only with a different approach?
I would feed 128d embeddings personally.
Regarding 2d vs. 128d embeddings: the objective functions of UMAP or TSNE are tailored towards the visualization task. VERSE is a bit different; it offers similarity preservation for the analysis of graphs.
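The pipeline being discussed can be sketched as follows. TSNE from scikit-learn is used as the 2d projector here (umap-learn's `UMAP` class has an equivalent `fit_transform` API if it is installed); the random 128d array is a placeholder for the actual VERSE output.

```python
# Sketch: embed at high dimension, project to 2d only for the figure.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
emb_128d = rng.normal(size=(500, 128))  # stand-in for 128d VERSE output

# 2d coordinates are for plotting only; downstream analysis
# (clustering, similarity search) should use emb_128d directly.
xy = TSNE(n_components=2, perplexity=30,
          random_state=0).fit_transform(emb_128d)
print(xy.shape)
```

The point is that the 2d output exists purely to draw the scatter plot; clustering metrics such as modularity are computed from the 128d vectors.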
I'll test that later this way. 🤔
The different approaches to designing objective functions are an important point. I haven't dug too deep into UMAP. Thank you!
Hi! I tried to use VERSE to visualize a not-so-large (nv: 23463, ne: 35923), well-clustered graph. I used the PPR version with `--dim 2` (Total steps (mil): 2346.3), then used the two dimensions as `x` and `y` (after normalization) and pre-calculated cluster IDs (Louvain method) as colour to visualize the embedded graph. I ended up with this: I was expecting a visualization in which all the clusters separate cleanly, as in the example shown in your article. Any idea which config I should use, or what was wrong with my procedure?
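For reference, the plotting step described above can be sketched like this. The embedding array and the cluster IDs are hypothetical placeholders for the `--dim 2` VERSE output and the Louvain labels; min-max normalization per axis is assumed, since the post does not specify the normalization used.

```python
# Sketch: min-max normalize a 2d embedding and colour it by cluster ID.
import numpy as np

rng = np.random.default_rng(0)
emb = rng.normal(size=(1000, 2))             # stand-in for 2d VERSE output
cluster_ids = rng.integers(0, 8, size=1000)  # stand-in for Louvain labels

# Min-max normalization per axis, mapping each coordinate into [0, 1]
lo, hi = emb.min(axis=0), emb.max(axis=0)
xy = (emb - lo) / (hi - lo)

# Plotting (uncomment if matplotlib is available):
# import matplotlib.pyplot as plt
# plt.scatter(xy[:, 0], xy[:, 1], c=cluster_ids, s=2, cmap="tab10")
# plt.show()
print(xy.min(), xy.max())
```

Note that per-axis min-max scaling only rescales the picture; it cannot separate clusters that the 2d embedding itself has mixed together, which is why the 128d-then-UMAP route discussed above tends to give cleaner figures.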