This is currently running in my session on grunch, using the `projects/behaviour/minibatch-parameters.py` script.
https://github.com/ixxi-dante/nw2vec/blob/74a748c8df91e68410cf60004e7793722bb022cb/projects/behaviour/minibatch-parameters.py#L129 is a mistake in what is currently running on grunch: it cancels out the noise on the node labels, so the run effectively corresponds to `...-features_noise_scale=0-...`. Not a huge problem, as we'll still learn from it nonetheless, but we should keep it in mind when reading the results.
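One plausible reading of that mistake (sketched with hypothetical names; the actual code is at the link above): the label noise ends up scaled by the feature-noise parameter, which is 0 in this run, so the labels stay clean.

```python
import numpy as np

# Hypothetical illustration, not the script's actual code: if the label
# noise is accidentally scaled by the feature-noise parameter, which is 0
# in this run, the "noisy" labels come out identical to the clean ones.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=(100, 20)).astype(float)

features_noise_scale = 0.0  # value in the current run
noisy_labels = labels + features_noise_scale * rng.normal(size=labels.shape)

assert np.allclose(noisy_labels, labels)  # the label noise is cancelled out
```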
Paused to look at results in #43.
Resumed.
Changes and fixes that have happened in the BlogCatalog exploration since this computation started:

- `u` removed in the embedding parametrisation (improved performance)
- `nx.planted_partition_network` (which sorts the nodes)
- fix of `labels` vs. `features` in the target function (from #48), which has no incidence on the current run, since the noise on the features is 0 and this variable has not been scaled; this is fixed in 7d5cc4eeb48841815283f20576b76802284e4819
- `Bernoulli` decoder for binary features (which these are); see the sketch after this list

In short: `u` removed, `Bernoulli` decoder used, optionally no adj loss scaling. So we could do better in terms of performance, but the results should still be true-ish.
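To be concrete about the `Bernoulli` decoder mentioned above: for binary features it outputs per-feature probabilities, and the reconstruction loss is the negative Bernoulli log-likelihood, i.e. binary cross-entropy. A minimal numpy sketch, not the repo's actual Keras implementation:

```python
import numpy as np

def bernoulli_decoder_nll(logits, features, eps=1e-7):
    """Negative Bernoulli log-likelihood of binary `features` given decoder
    `logits`, i.e. binary cross-entropy. Sketch only, not the repo's code."""
    probs = 1.0 / (1.0 + np.exp(-logits))   # per-feature probabilities
    probs = np.clip(probs, eps, 1.0 - eps)  # numerical safety
    return -np.sum(features * np.log(probs)
                   + (1 - features) * np.log(1 - probs), axis=-1)
```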
TODO: once this finishes, re-run one short training with the above fixes to see if it changes much or not.
Paused for explorations in #48.
Resumed.
Finished running.
Results notebook in b5fbb1d5d2cd26e9e785d5df8ffbe78146591af3; see `projects/behaviour/minibatch-parameters-results.ipynb` for the final grid.
Launched a second run with the changes mentioned above, with 500 training epochs and a network of size 1000 (20 communities of size 50), instead of 2000 (20 x 100).
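For reference, such a network can be generated with networkx's `planted_partition_graph`; a sketch, where `p_in`/`p_out` are placeholder values, not the run's actual parameters:

```python
import networkx as nx

# 20 communities of 50 nodes each -> a 1000-node network.
# p_in / p_out are illustrative values, not the run's actual parameters.
g = nx.planted_partition_graph(l=20, k=50, p_in=0.1, p_out=0.01, seed=42)
assert g.number_of_nodes() == 20 * 50
```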
Relaunched with `dims = (20, 25, 25)` to see if we get proper training results. If not, check what worked in #48.
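Assuming `dims` specifies (input feature dim, intermediate dim, embedding dim), which is my reading of the parameter and not confirmed from the script, the encoder stack would look roughly like this:

```python
from tensorflow import keras

# Assumption: dims = (input feature dim, intermediate dim, embedding dim).
# This is my reading of the parameter, not confirmed from the script.
dim_data, dim_mid, dim_emb = (20, 25, 25)
encoder = keras.Sequential([
    keras.layers.Dense(dim_mid, activation='relu', input_shape=(dim_data,)),
    keras.layers.Dense(dim_emb),  # embedding layer
])
```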
(Still waiting for the above run to finish, 11h left.)
It looks like the current minibatch strategy is not good: I'm not seeing good results on trainings with synthetic graphs, and it also doesn't make that much sense with the "sample the dataset uniformly according to variables of interest" strategy (except by chance). A sketch of that uniform-sampling idea is just below.
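Here is roughly what "sample the dataset uniformly" means as minibatching (a sketch with hypothetical names, not the repo's current minibatch code):

```python
import numpy as np

def uniform_minibatches(n_nodes, batch_size, rng=None):
    """Yield minibatches of node ids drawn uniformly without replacement.
    Sketch of the intended strategy, not the repo's current minibatching."""
    rng = rng if rng is not None else np.random.default_rng()
    perm = rng.permutation(n_nodes)
    for start in range(0, n_nodes, batch_size):
        yield perm[start:start + batch_size]
```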
I feel that:
So:
Results for the latest tests (500 training epochs, `1000 = 20 x 50` nodes, fixes from above) are in 39fee0fc55ad4f586ea30961f8cf4166a84a32da, in the `projects/behaviour/minibatch-parameters-issue_48=fixed-results.ipynb` notebook.
The bottom line is that in 500 minibatch epochs we get abysmal prediction performance, even though for a RW length of 100 those 500 epochs amount to the equivalent of 500 × 100 = 50000 fullbatch epochs.
See the pictures (training plots in the results notebook) with `dims = (20, 10, 2)` and with `dims = (20, 25, 25)`.
So I'm going on with the roadmap from the previous comment.
Work branched out from #19.