I've implemented testing to make sure this is doing the right thing. But ultimately, the problem here is that we're not allowing for bias parameters (by default) in the non-linearity. We then constrain the weights to be positive and, in this case, use a ReLU activation, meaning the network can only predict positive values. Its best solution, then, is to have negative latent phenotypes, which get floored to exactly 0 in the activation. I'm working on a PR, but the solution will essentially be to not allow any activation that cannot be negative.
One nice activation to use in place of ReLU would be PReLU; the difference is sketched below.
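A minimal PyTorch sketch (toy code, not torchdms itself) of the mechanism described above: with non-negative weights, no bias, and ReLU, any negative latent phenotype is floored to exactly 0, so the model cannot distinguish them; PReLU is still monotone increasing but keeps a (learnable) negative slope.

```python
import torch
import torch.nn as nn

# Toy monotone "g": a single linear layer, no bias, with a fixed positive weight.
g_linear = nn.Linear(1, 1, bias=False)
with torch.no_grad():
    g_linear.weight.fill_(1.0)  # positive weight => g is monotone increasing

latent = torch.tensor([[-2.0], [-0.5], [0.0], [1.0], [3.0]])  # latent phenotypes

relu_out = torch.relu(g_linear(latent))   # negatives collapse to exactly 0
prelu = nn.PReLU(init=0.25)               # monotone, but allows negative outputs
prelu_out = prelu(g_linear(latent))       # negatives stay informative

print(relu_out.squeeze().tolist())   # [0.0, 0.0, 0.0, 1.0, 3.0]
print(prelu_out.squeeze().tolist())  # [-0.5, -0.125, 0.0, 1.0, 3.0]
```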
I've noticed that whenever we impose some sort of monotonicity constraint on the `g` functions in `torchdms`, our models always fail to train (at least in every instance I've tried). I'm not sure why this is, but I'm raising an issue so I don't forget to come back to it at a better time.

I ran a quick experiment on Tyler's RBD data (which I had to re-prep on `/fh/fast/` due to some things added to `torchdms` since then that broke things); results can be found here.

Some examples:

Without monotonic constraints on `g`:

With monotonic constraints on `g`:

NOTE: I only trained these models for 100 epochs, with no independent starts and a very small `g()`, as the purpose was just to see whether things broke when a monotonic constraint was added.