Open wangq95 opened 4 years ago
Hi Qiansheng,
The paper points out that stronger regularization of the inner objective, i.e. the training loss, helps overcome the failure modes of DARTS. Therefore, you can just change the --weight_decay argument to the value specified in the paper to achieve similar performance. To run the adaptive L2 algorithm, you should set --early_stop=3. In this case, --weight_decay indicates the initial value of the L2 factor. Afterwards, the algorithm tracks the dominant eigenvalue of the Hessian and increases this L2 factor accordingly.
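To make the adaptive idea concrete, here is a minimal sketch of "track the dominant eigenvalue, then raise the L2 factor". It is not the repository's code: the function names, the growth threshold of 1.3, and the multiplication factor of 2.0 are all hypothetical, and the Hessian is assumed to be available only as a matrix-vector product.

```python
import math


def dominant_eigenvalue(matvec, dim, iters=100):
    """Estimate the largest-magnitude eigenvalue of a symmetric matrix
    by power iteration, given only a matrix-vector product `matvec`."""
    v = [1.0 / math.sqrt(dim)] * dim  # unit-norm starting vector
    eig = 0.0
    for _ in range(iters):
        w = matvec(v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        # Rayleigh quotient v^T H v (v has unit norm here)
        eig = sum(a * b for a, b in zip(v, w))
        v = [x / norm for x in w]
    return eig


def adapt_weight_decay(weight_decay, eig_history,
                       growth_threshold=1.3, factor=2.0):
    """Hypothetical rule: if the latest eigenvalue exceeds
    `growth_threshold` times the smallest earlier one, multiply the
    L2 factor by `factor`; otherwise leave it unchanged."""
    if len(eig_history) >= 2:
        if eig_history[-1] > growth_threshold * min(eig_history[:-1]):
            return weight_decay * factor
    return weight_decay


# Toy usage with a 2x2 diagonal "Hessian" diag(2, 1).
def toy_matvec(v):
    return [2.0 * v[0], 1.0 * v[1]]


eig = dominant_eigenvalue(toy_matvec, dim=2)
new_wd = adapt_weight_decay(3e-4, [0.1, 0.2])
```

In this sketch the value passed as the first argument of `adapt_weight_decay` plays the role of the initial --weight_decay, and it only grows when the tracked eigenvalue grows.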
Hope this was helpful. Best, Arber
Hi, I'm sorry, but I can't find the L2 regularization mentioned in your paper. Could you point out where the increased regularization is applied during the search? Thanks a lot!