SebChw / Actually-Robust-Training

Actually Robust Training: a tool inspired by Andrej Karpathy's "A Recipe for Training Neural Networks". It lets you decompose your deep learning pipeline into modular and insightful "Steps". It also has many features for testing and debugging neural nets.
MIT License

Implement init well #194

Open kordc opened 1 year ago

kordc commented 1 year ago

From #72: init well. Initialize the final layer weights correctly. E.g. if you are regressing some values that have a mean of 50 then initialize the final bias to 50. If you have an imbalanced dataset with a 1:10 ratio of positives:negatives, set the bias on your logits such that your network predicts a probability of 0.1 at initialization. Setting these correctly will speed up convergence and eliminate “hockey stick” loss curves where in the first few iterations your network is basically just learning the bias.
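A minimal sketch of the two bias values described above, using only the standard library. The function names `regression_bias_init` and `logit_bias_init` are hypothetical and not part of this repository; the sketch assumes a single sigmoid output for the classification case, where the bias that yields probability p is the logit log(p / (1 - p)).

```python
import math
from statistics import fmean


def regression_bias_init(targets):
    """Initial bias for a regression head: the mean of the training targets."""
    return fmean(targets)


def logit_bias_init(positive_prior):
    """Initial bias for a sigmoid output so that sigmoid(bias) == positive_prior."""
    if not 0.0 < positive_prior < 1.0:
        raise ValueError("positive_prior must be strictly between 0 and 1")
    return math.log(positive_prior / (1.0 - positive_prior))


def sigmoid(x):
    """Logistic function, used here only to check the computed bias."""
    return 1.0 / (1.0 + math.exp(-x))


# Regression targets with a mean of 50: start the final bias at 50.
reg_bias = regression_bias_init([40.0, 50.0, 60.0])

# Classification head that should predict a probability of 0.1 at initialization.
cls_bias = logit_bias_init(0.1)

print(f"regression bias: {reg_bias}")
print(f"logit bias: {cls_bias:.4f}, implied probability: {sigmoid(cls_bias):.4f}")
```

In a PyTorch model these values could be written into the last layer with `torch.nn.init.constant_(final_layer.bias, value)`, where `final_layer` stands for whatever the model's output layer is called. The network only predicts exactly this value at initialization if the final layer's weights contribute roughly zero, so this is a starting point rather than a guarantee.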

This should be a model modifier, or even a required function that the user has to implement.