For general nonlinear optimization, training with the MSE loss fits the parameters in an unbiased manner.
Say my data is Y and the nonlinear function is parametrized as f(A, x).
My question: suppose I add an L1 penalty to the nonlinear optimization:
LOSS = MSE(Y, f(A, x)) + lambda * ||A||_1
Can I treat this as a regularized optimization problem, and will a torch optimizer such as Adam track the solution accordingly?
Also, will this induce true zeros in the parameter estimates, or should I apply a specific method, such as projected optimization?
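For concreteness, here is a minimal sketch of the setup I mean. The model f(A, x) = tanh(x @ A), the synthetic data, and the values of lambda and the learning rate are placeholders made up for illustration, not my actual problem.

```python
import torch

torch.manual_seed(0)

# Hypothetical nonlinear model; stands in for any differentiable f(A, x).
def f(A, x):
    return torch.tanh(x @ A)

# Synthetic data generated from a sparse "true" A (illustrative assumption).
n_samples, n_features = 200, 5
x = torch.randn(n_samples, n_features)
A_true = torch.tensor([1.0, 0.0, -0.5, 0.0, 0.0])
Y = f(A_true, x)

# Parameters to estimate, with a small random initialization.
A = (0.1 * torch.randn(n_features)).requires_grad_()
lam = 1e-2  # penalty weight lambda (arbitrary choice)
optimizer = torch.optim.Adam([A], lr=1e-2)

losses = []
for step in range(500):
    optimizer.zero_grad()
    mse = torch.mean((Y - f(A, x)) ** 2)
    # LOSS = MSE(Y, f(A, x)) + lambda * ||A||_1
    loss = mse + lam * A.abs().sum()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# Inspect the estimate and count entries that are exactly zero.
A_hat = A.detach()
print("estimated A:", A_hat)
print("number of exact zeros:", int((A_hat == 0).sum()))
```

The last two lines are how I would check whether any entries of A end up exactly at zero rather than just small.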
Thanks