Closed lizhenstat closed 5 years ago
Hi, I noticed that the paper uses proximal gradient descent to solve the optimization problem. However, training a neural network is already a non-convex problem, so why can't we just use a standard optimizer (Adam, etc.) to solve it?
Thanks in advance
@jaehong-yoon93 Looking forward to your reply
I asked one of my classmates, and he says that if we do not use an optimization method like proximal gradient descent, we can't achieve sparsity.
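To illustrate the classmate's point, here is a minimal sketch on a toy L1-regularized least-squares problem. It is not the paper's network or code. The data, the function names (`soft_threshold`, `fit`), and the hyperparameters are all assumptions made for illustration. It compares a plain subgradient update on the L1 penalty with a proximal update (soft-thresholding) and counts how many weights end up exactly zero.

```python
import numpy as np

def soft_threshold(w, t):
    # Proximal operator of t * ||w||_1: shrink each entry toward zero by t
    # and clip to exactly zero when |w| <= t.
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def fit(use_prox, steps=300, lr=0.1, lam=0.5, seed=0):
    # Hypothetical toy data: only the first 3 of 20 features matter.
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(100, 20))
    true_w = np.zeros(20)
    true_w[:3] = [2.0, -3.0, 1.5]
    y = X @ true_w + 0.1 * rng.normal(size=100)

    w = rng.normal(size=20)
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)  # gradient of the smooth loss only
        if use_prox:
            # Proximal gradient step: gradient step on the smooth part,
            # then the closed-form proximal step for the L1 penalty.
            w = soft_threshold(w - lr * grad, lr * lam)
        else:
            # Subgradient step: treat lam * sign(w) as the penalty's gradient.
            w = w - lr * (grad + lam * np.sign(w))
    return w

w_prox = fit(use_prox=True)
w_sub = fit(use_prox=False)
zeros_prox = int(np.sum(w_prox == 0.0))
zeros_sub = int(np.sum(w_sub == 0.0))
print("exact zeros with proximal step:", zeros_prox)
print("exact zeros with subgradient step:", zeros_sub)
```

In this sketch the subgradient update tends to leave small weights oscillating around zero rather than landing on it, while the thresholding step sets them to exactly zero. The same reasoning is usually given for Adam and similar optimizers, although this toy example does not test them.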