Open shoutOutYangJie opened 2 years ago
I don't understand what the function "get_custom_L2" does.
How is it different from setting weight decay directly in the optimizer arguments? Looking forward to your reply.