Different initial means (input.master) with minibatch and num-instances

EricR86 commented 8 years ago

Original report (BitBucket issue) by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).

It's possible to run simpleseg training with the following command:

SEGWAY_RAND_SEED=1498730685 segway "$cluster_arg" \
    --num-instances=4 \
    --minibatch-frac=0.1 \
    --split-sequences=1000 \
    --max-train-rounds=5 \
    --include-coords="../include-coords.bed" \
    --tracks-from="../tracks.txt" --num-labels=4 \
    train "../simpleseg.genomedata" traindir

and run it twice and get different initial means in input.master.

$ diff test-20160823.z8hcsD/traindir/params/input.master test-20160823.9C9nVr/traindir/params/input.master
156,163c156,163
< 0 mean_seg0_subseg0_testtrack1 1 0.571728495846
< 1 mean_seg0_subseg0_testtrack2 1 0.482009675214
< 2 mean_seg1_subseg0_testtrack1 1 0.393569676591
< 3 mean_seg1_subseg0_testtrack2 1 0.524263233453
< 4 mean_seg2_subseg0_testtrack1 1 0.539103684627
< 5 mean_seg2_subseg0_testtrack2 1 0.452877599219
< 6 mean_seg3_subseg0_testtrack1 1 0.508957638027
< 7 mean_seg3_subseg0_testtrack2 1 0.57080277882
---
> 0 mean_seg0_subseg0_testtrack1 1 0.549077230446
> 1 mean_seg0_subseg0_testtrack2 1 0.494353579829
> 2 mean_seg1_subseg0_testtrack1 1 0.572589020884
> 3 mean_seg1_subseg0_testtrack2 1 0.555109062568
> 4 mean_seg2_subseg0_testtrack1 1 0.444562359251
> 5 mean_seg2_subseg0_testtrack2 1 0.460819726119
> 6 mean_seg3_subseg0_testtrack1 1 0.475074245946
> 7 mean_seg3_subseg0_testtrack2 1 0.470233877591

$ env | grep SEGWAY
SEGWAY_NUM_LOCAL_JOBS=2

This is a common use case and needs to be fixed.

EricR86 commented 8 years ago

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).

Edited issue description

EricR86 commented 8 years ago

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).

This might be an incorrectly flagged issue. In the same two test directories, the input.0.master files are identical.

And from the docs:

input.*.master - generated hyperparameters and starting parameters
input.master - best set of hyperparameters and starting parameters

EDIT: This is definitely an issue; the later input.*.master files diverge.

EricR86 commented 8 years ago

Original comment by Rachel Chan (Bitbucket: rcwchan).

@ericr86 Are all input.*.masters equivalent?

It reads to me as if input.master is actually just the best input.*.master rather than something that is randomly generated separately. So if there was a race condition with the generation of the later input.*.masters and one of them turned out to be the 'best' one, then we would end up with different results.
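To illustrate the kind of ordering problem described above, here is a minimal sketch. It is not Segway's code, and it is not confirmed to be the actual cause. It assumes, purely for illustration, that all instances draw their starting means from one shared generator, and that the order in which they reach it can vary between runs. The function names and the `order` argument are hypothetical.

```python
import random


def shared_rng_means(seed, num_means, order):
    """Draw starting means for each instance from ONE shared generator.

    `order` is the (hypothetical) order in which instances happen to
    reach the generator; with parallel jobs it may differ between runs.
    """
    rng = random.Random(seed)
    return {instance: [rng.uniform(0, 1) for _ in range(num_means)]
            for instance in order}


def per_instance_means(seed, num_means, order):
    """Draw starting means from a generator seeded per instance.

    Each instance derives its own seed from the base seed and its index,
    so the result does not depend on the order of execution.
    """
    means = {}
    for instance in order:
        rng = random.Random(seed + instance)
        means[instance] = [rng.uniform(0, 1) for _ in range(num_means)]
    return means


seed = 1498730685  # the SEGWAY_RAND_SEED value from the report
run_a = shared_rng_means(seed, 8, [0, 1, 2, 3])
run_b = shared_rng_means(seed, 8, [0, 2, 1, 3])
print(run_a[0] == run_b[0])  # instance 0 goes first in both runs
print(run_a[1] == run_b[1])  # instance 1 draws at a different position
```

Under this assumption, instance 0 would match across runs while later instances would not. If one of the later instances is then picked as the best, input.master would differ as well. Seeding per instance, as in `per_instance_means`, removes the dependence on ordering.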

EricR86 commented 8 years ago

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).

Later input.*.masters do differ, and not just by instance number: the sets of means diverge significantly.
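For anyone reproducing this, here is a rough sketch of a check that compares every input*.master file between two test runs in one go. The helper name is hypothetical, and the `traindir/params` layout is assumed from the paths in the diff command in the report.

```python
import filecmp
from pathlib import Path


def differing_master_files(run_a, run_b, params_subdir="traindir/params"):
    """Return sorted names of input*.master files that differ between two runs."""
    dir_a = Path(run_a) / params_subdir
    dir_b = Path(run_b) / params_subdir
    # Collect names from both runs so a file present in only one is not missed.
    names = sorted({path.name
                    for directory in (dir_a, dir_b)
                    for path in directory.glob("input*.master")})
    differing = []
    for name in names:
        file_a, file_b = dir_a / name, dir_b / name
        if not (file_a.is_file() and file_b.is_file()):
            # A file missing from one run counts as a difference.
            differing.append(name)
        elif not filecmp.cmp(file_a, file_b, shallow=False):
            # shallow=False forces a byte-by-byte comparison.
            differing.append(name)
    return differing
```

Calling `differing_master_files("test-20160823.z8hcsD", "test-20160823.9C9nVr")` would list which instance files diverge, instead of diffing each pair by hand.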

EricR86 commented 8 years ago

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).

changed state from "new" to "resolved"

Resolved in Pull Request #61

hoffmangroup / segway

Different initial means (input.master) with minibatch and num-instances #81