ixxi-dante / an2vec

Bringing node2vec and word2vec together for cool stuff
GNU General Public License v3.0

Train BlogCatalog with higher embedding dimension #31

wehlutyk closed this issue 6 years ago

wehlutyk commented 6 years ago

The goal is to see whether the embedding dimension is the limiting factor behind the embeddings currently not looking good at all. (See here.)

wehlutyk commented 6 years ago

Currently running in my session on grunch, using https://github.com/ixxi-dante/nw2vec/blob/master/projects/scale/blogcatalog.py with dim_ξ = 10.

wehlutyk commented 6 years ago

Training is done; the results still need to be looked at.

wehlutyk commented 6 years ago

Results are in commit 1f4c4106daf838033dc4e7dae8c7d0ed1f980d35; see the projects/scale/blogcatalog-dim_ξ=10-results.ipynb notebook.

Highlights (figures extracted from the notebook):

Training history

Embedding scatter plots for all couples of dimensions

Adjacency reconstruction
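For readers without access to the notebook, here is a minimal sketch of what the last two checks involve: enumerating every pair of embedding dimensions (one scatter plot per pair) and scoring an adjacency reconstruction. It is not the notebook's code. The inner-product decoder with a sigmoid is an assumption borrowed from common graph auto-encoder setups, and the function names, the toy embeddings and the 0.5 threshold are all hypothetical.

```python
import itertools
import math


def dimension_pairs(dim):
    """All unordered pairs of embedding dimensions, one per scatter plot."""
    return list(itertools.combinations(range(dim), 2))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def reconstruct_adjacency(embeddings):
    """Edge probabilities from an inner-product decoder (assumed, not confirmed)."""
    n = len(embeddings)
    return [
        [sigmoid(sum(a * b for a, b in zip(embeddings[i], embeddings[j])))
         for j in range(n)]
        for i in range(n)
    ]


def reconstruction_accuracy(adjacency, probabilities, threshold=0.5):
    """Fraction of off-diagonal entries whose thresholded prediction is correct."""
    n = len(adjacency)
    hits = 0
    total = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # self-loops are ignored in this sketch
            predicted_edge = probabilities[i][j] >= threshold
            hits += int(predicted_edge == bool(adjacency[i][j]))
            total += 1
    return hits / total


# Hypothetical toy data: nodes 0 and 1 are linked, node 2 is isolated.
toy_adjacency = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
toy_embeddings = [[2.0, 0.0], [2.0, 0.0], [-2.0, 0.0]]

pairs = dimension_pairs(10)  # dim_ξ = 10 gives 45 pairs to plot
toy_probabilities = reconstruct_adjacency(toy_embeddings)
toy_accuracy = reconstruction_accuracy(toy_adjacency, toy_probabilities)
```

On a sparse graph such as BlogCatalog, plain accuracy is dominated by the non-edges, so a ranking measure would likely be more informative than this thresholded score.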

wehlutyk commented 6 years ago

Now:

I don't think going to even higher dimensions will reveal anything more. Instead, there are (at least) two other points to check:

And combinations of those two. If all that fails to explain the bad adjacency reconstruction, then the answers will be found by working on the behaviour project: #30 and #32 mainly.

Closing this issue in favour of all the above.