Dropout didn't regularize the cohorts well, so this fits a normal distribution to the beta variables instead.
This was a bit hard to get to converge. Fitting everything as part of the minibatches didn't work well, so I broke the distribution fit out to run after the minibatches, and now it converges well. I think the reason is that otherwise sigma goes to zero very quickly early on, and then it's very hard to get out of that state.
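For reviewers, here is a rough sketch of the idea, not the actual code in this PR. All names (`prior_penalty_grad`, `refit_prior`, `train_epoch`, `data_grad`, `min_sigma`) are hypothetical, and the `min_sigma` floor is an extra assumption of mine. The betas are updated in the minibatch loop with mu and sigma held fixed, and the normal distribution is refitted in closed form only after the loop. If sigma were a free parameter inside the loop, the penalty log(sigma) + (beta - mu)^2 / (2 sigma^2) could be pushed down without bound by shrinking sigma while the betas move toward mu, which would match the collapse described above.

```python
import numpy as np


def prior_penalty_grad(betas, mu, sigma):
    """Gradient of -log N(betas | mu, sigma^2) w.r.t. betas; mu and sigma are held fixed."""
    return (betas - mu) / sigma**2


def refit_prior(betas, min_sigma=1e-3):
    """Closed-form maximum-likelihood normal fit to the current betas.

    min_sigma is a hypothetical floor that keeps sigma away from zero.
    """
    mu = float(np.mean(betas))
    sigma = max(float(np.std(betas)), min_sigma)
    return mu, sigma


def train_epoch(betas, minibatches, data_grad, mu, sigma, lr=0.1):
    """One pass over the minibatches, then a separate refit of the prior.

    data_grad(betas, batch) is a placeholder for the gradient of the data loss.
    Any per-minibatch scaling of the prior term is omitted for brevity.
    """
    for batch in minibatches:
        grad = data_grad(betas, batch) + prior_penalty_grad(betas, mu, sigma)
        betas = betas - lr * grad
    # The distribution parameters are only updated here, outside the minibatch loop.
    mu, sigma = refit_prior(betas)
    return betas, mu, sigma
```

In this sketch you would call `train_epoch` once per epoch and pass the returned `mu` and `sigma` into the next call.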
Also, Hound flagged a bunch of old code, so I reformatted some of it. The PR got a bit bigger because of this.
Coverage increased (+0.1%) to 94.35% when pulling bd55b59987bac9f9c1be33d38d075ef705c2cb60 on no-dropout into 6dc9e834ce58a6fb9ac963825ea94434b925e73f on master.