Pavlikm85 opened this issue 6 years ago
I am curious to know if progress has been made on this issue, as I also feel the regularisation is not working properly. I started with a small training set (10%), no regularisation, and a network with 3 layers of 8 neurons each in order to encourage overfitting. Then I started increasing the regularisation parameter, but not much happens, and the optimisation gets stuck for regularisation parameter values above, say, 0.03. I am also curious how you can override the default settings with negative values with the current interface.

Olivier
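For what it's worth, here is a rough sketch of a comparable experiment outside the Playground (in PyTorch rather than the Playground's TypeScript; the dataset, layer sizes, learning rate, and L2 values below only approximate the settings described above). Sweeping the L2 coefficient shows how training can stall once it gets large:

```python
# Approximate reproduction of the setup described above: a 2-D toy dataset,
# only 10% of it used for training, and a 3x8 tanh network. This is not the
# Playground's code; the constants are illustrative.
import torch
import torch.nn as nn
from sklearn.datasets import make_circles

X, y = make_circles(n_samples=2000, noise=0.2, factor=0.5, random_state=0)
X = torch.tensor(X, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.float32).unsqueeze(1)
n_train = int(0.10 * len(X))            # small training split to encourage overfitting
X_tr, y_tr = X[:n_train], y[:n_train]
X_te, y_te = X[n_train:], y[n_train:]

def make_net():
    return nn.Sequential(
        nn.Linear(2, 8), nn.Tanh(),
        nn.Linear(8, 8), nn.Tanh(),
        nn.Linear(8, 8), nn.Tanh(),
        nn.Linear(8, 1),
    )

loss_fn = nn.BCEWithLogitsLoss()
for l2 in [0.0, 0.001, 0.003, 0.01, 0.03, 0.1]:   # roughly the range offered by the Playground
    torch.manual_seed(0)
    net = make_net()
    # weight_decay applies an L2 penalty on all parameters
    opt = torch.optim.SGD(net.parameters(), lr=0.03, weight_decay=l2)
    for _ in range(2000):
        opt.zero_grad()
        loss_fn(net(X_tr), y_tr).backward()
        opt.step()
    with torch.no_grad():
        tr = loss_fn(net(X_tr), y_tr).item()
        te = loss_fn(net(X_te), y_te).item()
    print(f"L2={l2:<6} train loss={tr:.3f}  test loss={te:.3f}")
```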
I've gotten regularization to work reliably at typical values by making a small tweak that allows changing the regularization hyperparameters on the fly (in a forked Playground environment). Starting training with zero regularization and then turning on L1/L2 regularization after a short warm-up period (say, ~5-100 epochs, depending on the other hyperparameters) works very well. I'm not sure why this is the case, but I suspect a combination of sub-optimal initialization and regularization generally being more useful after a short warm-up period. As a heuristic, regularization seems to work best after letting the network fit the data first (overfitting is not a problem at that stage); once you turn on L1/L2 regularization, you can watch the overfitting diminish, which is especially salient with L1 "killing" redundant nodes. I've just submitted a PR here; hopefully it will be added to the official repo soon!
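For reference, here is a minimal sketch of the warm-up idea (not the actual PR code; the helper name, warm-up length, and penalty coefficients are illustrative): train with no penalty for the first few epochs, then add the L1/L2 term to the loss.

```python
# Delayed-regularization sketch: the penalty is only added to the loss once
# the epoch counter passes the warm-up threshold. All values are illustrative.
import torch
import torch.nn as nn

def l1_l2_penalty(model, l1=0.0, l2=0.0):
    """Sum of L1/L2 penalties over all weight matrices (biases excluded)."""
    reg = torch.tensor(0.0)
    for name, p in model.named_parameters():
        if "weight" in name:
            reg = reg + l1 * p.abs().sum() + l2 * p.pow(2).sum()
    return reg

def train(model, X, y, epochs=2000, warmup=100, l1=0.0, l2=0.003, lr=0.03):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    for epoch in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        if epoch >= warmup:                 # regularization switched on only after the warm-up
            loss = loss + l1_l2_penalty(model, l1=l1, l2=l2)
        loss.backward()
        opt.step()
    return model

# Toy usage on XOR-like labels with a 3x8 network.
X = torch.randn(200, 2)
y = (X[:, 0] * X[:, 1] > 0).float().unsqueeze(1)
net = nn.Sequential(nn.Linear(2, 8), nn.Tanh(),
                    nn.Linear(8, 8), nn.Tanh(),
                    nn.Linear(8, 8), nn.Tanh(),
                    nn.Linear(8, 1))
train(net, X, y)
```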
I know it's weird and that somebody should have noticed, but regularization only seems to help when I override the default settings and use negative values. With negative values, regularization does indeed seem to make the network work better most of the time (this is even more noticeable with deeper networks), but with anything beyond the lowest positive values, the neural network usually malfunctions and fails to learn.