ixxi-dante / an2vec

Bringing node2vec and word2vec together for cool stuff
GNU General Public License v3.0

Minibatch sensitivity analysis #29

Closed: wehlutyk closed this issue 6 years ago

wehlutyk commented 6 years ago

Work branched out from #19.

wehlutyk commented 6 years ago

This is currently running in my session on grunch, using the projects/behaviour/minibatch-parameters.py script.

wehlutyk commented 6 years ago

https://github.com/ixxi-dante/nw2vec/blob/74a748c8df91e68410cf60004e7793722bb022cb/projects/behaviour/minibatch-parameters.py#L129 is a mistake in what is currently running on grunch: it cancels out the noise on the node labels. Not a huge problem, as we will still learn from the run nonetheless, but we should:
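
For illustration, here is the kind of cancellation meant, as a purely hypothetical numpy sketch (the variable names and the exact operation are mine, not those of minibatch-parameters.py):

```python
import numpy as np

rng = np.random.RandomState(0)

# One-hot community labels for 6 nodes in 3 communities.
labels = np.eye(3)[rng.randint(3, size=6)]

# Intended behaviour: perturb the labels with Gaussian noise.
noisy_labels = labels + .1 * rng.randn(*labels.shape)

# Bug of the kind described: re-deriving hard labels from the noisy
# ones (e.g. by one-hot-encoding them again) silently removes the noise.
features = np.eye(3)[noisy_labels.argmax(axis=1)]

# At this noise scale, argmax virtually always recovers the clean
# labels, so the model trains on effectively noiseless features.
print((features == labels).all())  # True
```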

wehlutyk commented 6 years ago

Paused to look at results in #43.

wehlutyk commented 6 years ago

Resumed.

wehlutyk commented 6 years ago

Changes and fixes that have happened in the BlogCatalog exploration since this computation started:

In short: u was removed, a Bernoulli decoder is used, and the adjacency loss scaling is now optional. So we could do better in terms of performance, but the results should still be true-ish.
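
For reference, a Bernoulli decoder models each adjacency entry as an independent coin flip, so the reconstruction loss becomes a binary cross-entropy on edge logits. A minimal numpy sketch (function and variable names are illustrative, not taken from the codebase):

```python
import numpy as np

def bernoulli_adj_loss(adj_true, adj_logits):
    """Mean negative log-likelihood of the adjacency matrix under a
    Bernoulli decoder: p(edge) = sigmoid(logit) for each node pair."""
    # Numerically stable binary cross-entropy computed from logits.
    return np.mean(
        np.maximum(adj_logits, 0)
        - adj_logits * adj_true
        + np.log1p(np.exp(-np.abs(adj_logits)))
    )

# Example: a 4-node graph and random decoder logits.
adj_true = np.array([[0, 1, 1, 0],
                     [1, 0, 0, 1],
                     [1, 0, 0, 0],
                     [0, 1, 0, 0]], dtype=float)
adj_logits = np.random.RandomState(0).randn(4, 4)
print(bernoulli_adj_loss(adj_true, adj_logits))
```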

TODO: once this finishes, re-run one short training with the above fixes to see whether the results change much.

wehlutyk commented 6 years ago

Paused for explorations in #48.

wehlutyk commented 6 years ago

Resumed.

wehlutyk commented 6 years ago

Finished running.

wehlutyk commented 6 years ago

Results notebook in b5fbb1d5d2cd26e9e785d5df8ffbe78146591af3, see projects/behaviour/minibatch-parameters-results.ipynb. Here is the final grid:

[Figure: final grid of the minibatch sensitivity analysis]

wehlutyk commented 6 years ago

Launched a second run with the changes mentioned above, with 500 training epochs and a network of size 1000 (20 communities of size 50), instead of 2000 (20 x 100).
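
For context, one standard way to generate this kind of community-structured synthetic network is a stochastic block model; a sketch with illustrative connection probabilities (not necessarily the generator or parameters the script uses):

```python
import networkx as nx

# 20 communities of 50 nodes each, i.e. 1000 nodes total, with dense
# intra-community and sparse inter-community connections.
sizes = [50] * 20
p_in, p_out = .1, .001  # illustrative values
probs = [[p_in if i == j else p_out for j in range(20)]
         for i in range(20)]

g = nx.stochastic_block_model(sizes, probs, seed=0)
print(g.number_of_nodes(), g.number_of_edges())
```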

wehlutyk commented 6 years ago

Relaunched with dims = (20, 25, 25) to see if we get proper training results. If not, check what worked in #48.
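
My reading of the dims tuple, assuming it means (feature dimension, intermediate layer dimension, embedding dimension); a minimal dense Keras sketch of the corresponding encoder shape, leaving out the graph-convolutional and variational parts of the real model:

```python
from keras.layers import Input, Dense
from keras.models import Model

dim_data, dim_l1, dim_emb = 20, 25, 25  # dims = (20, 25, 25)

x = Input(shape=(dim_data,))
h = Dense(dim_l1, activation='relu')(x)
z = Dense(dim_emb)(h)  # embedding layer

encoder = Model(inputs=x, outputs=z)
encoder.summary()
```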

wehlutyk commented 6 years ago

(Still waiting for the above run to finish, 11h left.)

It looks like the current minibatch strategy is not good: I'm not seeing good results in trainings on synthetic graphs, and it also doesn't make much sense with the "sample the dataset uniformly according to variables of interest" strategy (except by chance).
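
For comparison, the simplest version of the "sample the dataset uniformly" idea is plain uniform node minibatching, sketched below (this is not the random-walk-based sampling currently implemented):

```python
import numpy as np

def uniform_minibatches(n_nodes, batch_size, rng):
    """Yield minibatches of node indices drawn uniformly without
    replacement, so every node is seen exactly once per epoch."""
    order = rng.permutation(n_nodes)
    for start in range(0, n_nodes, batch_size):
        yield order[start:start + batch_size]

rng = np.random.RandomState(0)
for batch in uniform_minibatches(1000, 100, rng):
    pass  # train on `batch` here
```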

I feel that:

So:

wehlutyk commented 6 years ago

Results for the latest tests (500 training epochs, 1000 = 20 x 50 nodes, fixes from above) are in 39fee0fc55ad4f586ea30961f8cf4166a84a32da, in the projects/behaviour/minibatch-parameters-issue_48=fixed-results.ipynb notebook.

The bottom line is that in 500 minibatch epochs we get abysmal prediction performance, even though for an RW length of 100 that is the equivalent of 50,000 fullbatch epochs (500 × 100).

See pictures here with dims = (20, 10, 2):

[Figure: minibatch analysis with dims = (20, 10, 2)]

and with dims = (20, 25, 25):

[Figure: minibatch analysis with dims = (20, 25, 25)]

So I'm moving ahead with the roadmap from the previous comment.