Yufeng-shen / TJ2GBE

Reconstruct grain boundary energy from triple junction geometries
BSD 3-Clause "New" or "Revised" License

Minimum Regularization Strength (lambda) #9

Closed sgbaird closed 3 years ago

sgbaird commented 3 years ago

Hi Yufeng,

The paper's appendix says that, for a large enough value of lambda, the "optimization problem approximately becomes:", and that using a smaller lambda can help with noisier datasets. Any comments on the minimum regularization strength that can be used without the approximation breaking down? For example, would values such as 0.01, 0.1, or 1 be considered "too small"?

Sterling

sgbaird commented 3 years ago

Still curious if you have thoughts on this

Yufeng-shen commented 3 years ago

Hi Sterling,

Sorry I missed your previous comment. As with other regularized optimization problems, it is very hard to say how to choose the right regularization parameter lambda.

Usually, a good practice is to choose lambda so that the first term ( |C X|^2 ) and the second term ( |lambda B X|^2 ) have similar magnitudes when the objective function is at its minimum. Another way is to choose lambda so that, after minimization, the magnitude of |B X|^2 is close to your estimate of the experimental error.

As you can imagine, the first approach is easier than the second, and it is the one used in my code. As I remember, the resulting lambda is approximately the number of rows of C divided by the number of rows of B. In fact, using a larger lambda didn't change the result much in my testing; I guess the experimental error is averaged out somehow.
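A rough sketch of the first approach, for readers who want to check the balance of the two terms on their own data. This is not the code in TJ2GBE: the function names are hypothetical, and the unit-norm constraint |X| = 1 is an assumption made here only so that the toy minimization has a non-trivial solution.

```python
import numpy as np


def row_ratio_lambda(C, B):
    """Starting guess recalled in the reply above: rows of C / rows of B."""
    return C.shape[0] / B.shape[0]


def term_magnitudes(C, B, X, lam):
    """Return (|C X|^2, |lam B X|^2) for a given solution X."""
    data_term = float(np.sum((C @ X) ** 2))
    reg_term = float(np.sum((lam * (B @ X)) ** 2))
    return data_term, reg_term


def solve_unit_norm(C, B, lam):
    """Minimize |C X|^2 + |lam B X|^2 subject to |X| = 1.

    ASSUMPTION: the unit-norm constraint is illustrative only. Under it,
    the minimizer is the eigenvector of C^T C + lam^2 B^T B that belongs
    to the smallest eigenvalue.
    """
    M = C.T @ C + lam**2 * (B.T @ B)
    _, vecs = np.linalg.eigh(M)  # eigenvalues are returned in ascending order
    return vecs[:, 0]


if __name__ == "__main__":
    # Synthetic stand-ins for C and B; real matrices come from your data.
    rng = np.random.default_rng(0)
    C = rng.normal(size=(200, 10))
    B = rng.normal(size=(20, 10))

    lam0 = row_ratio_lambda(C, B)
    for lam in (0.1 * lam0, lam0, 10 * lam0):
        X = solve_unit_norm(C, B, lam)
        data_term, reg_term = term_magnitudes(C, B, X, lam)
        print(f"lambda={lam:g}  |CX|^2={data_term:.3g}  |lambda BX|^2={reg_term:.3g}")
```

Comparing the two printed magnitudes across a few candidate values shows whether a given lambda leaves one term dominating the other. The row-count ratio is only a starting point for that scan.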

In conclusion, 0.01, 0.1, and 1 would be too small (I don't think your B would have more rows than C). I recommend using a larger lambda; it didn't cause problems in my tests.

Yufeng

Yufeng-shen commented 3 years ago

BTW, I don't think there is a single correct answer regarding how to choose the hyperparameters in a model. It can be a research topic by itself.

sgbaird commented 3 years ago

Great. Thanks for this. I was having trouble with 5DOF interpolation results on some experimental data. I also agree, choosing hyperparameters can get pretty tricky. This gives me some great info to go off of. Thank you.