nshdesai / Linear-Regression

Linear Regression using Gradient Descent

Using scipy's genetic algorithm for initial parameter estimation in gradient descent #2

Closed · zunzun closed this 7 years ago

zunzun commented 7 years ago

I see you are writing Python code for gradient descent optimization. The scipy authors have added a genetic algorithm that can be used for initial parameter estimation in gradient descent: the function is scipy.optimize.differential_evolution.
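For a simple straight line like the one in your repo, using it might look something like this minimal sketch (the data, the bounds, and the variable names here are my illustrative assumptions, not your actual code):

```python
import numpy as np
from scipy.optimize import differential_evolution

# Illustrative data; substitute your own x/y arrays.
x_data = np.linspace(0.0, 10.0, 50)
y_data = 3.0 * x_data + 2.0

def sum_of_squared_error(params):
    # Total squared error of the straight line y = m*x + c.
    m, c = params
    return np.sum((y_data - (m * x_data + c)) ** 2)

# differential_evolution requires finite search bounds for each parameter.
bounds = [(-10.0, 10.0), (-10.0, 10.0)]
result = differential_evolution(sum_of_squared_error, bounds, seed=3)
m0, c0 = result.x  # starting values for a subsequent gradient descent
```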

I have used scipy's Differential Evolution genetic algorithm to determine initial parameters for fitting a double Lorentzian peak equation to Raman spectroscopy data of carbon nanotubes, and the results were excellent. The GitHub project, with a test spectroscopy data file, is:

https://github.com/zunzun/RamanSpectroscopyFit

If you have any questions, please let me know. My background is in nuclear engineering and industrial radiation physics, and I love Python, so I will be glad to help.

nshdesai commented 7 years ago

Hi, I really appreciate you taking the time to check out my repo. Just to be clear: the differential evolution algorithm is there so that we can optimize gradient descent even further by estimating good initial values, right?

Also, I would love to know the right way to integrate it with the existing code.

One more thing: although this will generalize better for this dataset, since it has two peaks, won't it defeat the point of "linear" regression? P.S. I will do it anyway.

zunzun commented 7 years ago

Three questions, three answers:

1) Yes, in scipy.optimize the Differential Evolution genetic algorithm is used for initial parameter estimation. In your code the c and m parameters are initially set to 0.0, and because your equation is a simple straight line, that should work OK with gradient descent: the error space is simple. With more complex equations the error space is not so simple, and the choice of starting parameters becomes more important. In the double Lorentzian equation of my Raman spectroscopy example, the fit is very, very sensitive to the initial parameters because the equation is more complex, with eight separate parameters to fit; the error space is much more convoluted, so gradient descent is much more likely to get stuck in a local error minimum unless good starting parameters are used.
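To make that concrete, here is a rough sketch of seeding a plain gradient descent loop with the genetic-algorithm estimates instead of zeros. It continues the sketch from my first comment, and the names (m0, c0, x_data, y_data, the learning rate) are my illustrative assumptions, not your repo's actual variables:

```python
# Continues the earlier sketch: m0, c0 come from differential_evolution,
# and x_data, y_data are the same illustrative numpy arrays.
learning_rate = 0.01
m, c = m0, c0  # instead of m = c = 0.0
n = float(x_data.size)

for _ in range(1000):
    y_pred = m * x_data + c
    # Gradients of the mean squared error with respect to m and c.
    grad_m = (-2.0 / n) * np.sum(x_data * (y_data - y_pred))
    grad_c = (-2.0 / n) * np.sum(y_data - y_pred)
    m -= learning_rate * grad_m
    c -= learning_rate * grad_c
```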

2) To use my code, replace the example data and equation with your own. My code is a working example, not the best solution for your specific problem; it is intended to illustrate using the scipy genetic algorithm to fit a particular equation, and it does that quite well for my problem.

3) Linear regression is named "linear" not because of the specific equation used, but because of the type of equation that can be fitted. For example, a second-order polynomial (quadratic) equation:

y = a + bX + cX^2

can be fitted with linear algebra (the "linear" in "linear regression") to find the parameters a, b, and c. Using the same linear regression code that fits the quadratic above, you could also fit:

y = a*sin(X) + b*log(X) + c*X^5

Notice that each parameter is a simple multiplier on some function such as sin(X) or X^2, so the linear algebra used is the same for both equations; the sketch below shows the idea.
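Here is a minimal numpy sketch of that point (illustrative data only, not code from either of our repositories). Each term of the equation becomes one column of a design matrix, and the fit is a single linear least-squares solve no matter which functions appear:

```python
import numpy as np

x = np.linspace(1.0, 10.0, 50)  # x > 0 so that log(x) is defined
y = 2.0 * np.sin(x) + 0.5 * np.log(x) + 1e-4 * x**5  # example data

# One column per term; the parameters a, b, c are simple multipliers.
design = np.column_stack([np.sin(x), np.log(x), x**5])
(a, b, c), *_ = np.linalg.lstsq(design, y, rcond=None)
```

However, a non-linear equation such as: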

y = a*X^b + c

cannot be solved using linear algebra, because the parameter "b" is not a simple multiplier but a power; a non-linear solver such as gradient descent must be used in this case. If good starting values are not used to begin the "descent" toward lower error, the solver can stop at a low but not minimum error. Non-linear solvers such as Levenberg-Marquardt were developed to mitigate this problem, but bad starting parameters will usually cause trouble even for slightly complex equations.
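As an illustration only (again with made-up data, assuming scipy rather than code from either project): scipy.optimize.curve_fit uses a Levenberg-Marquardt solver when no bounds are given, and its p0 argument is exactly where good starting values, e.g. from differential_evolution, would go:

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b, c):
    # Non-linear in b: it enters as a power, not as a multiplier.
    return a * np.power(x, b) + c

x = np.linspace(1.0, 10.0, 50)
y = 1.5 * np.power(x, 2.2) + 3.0  # example data

# With no bounds, curve_fit defaults to a Levenberg-Marquardt method;
# p0 supplies the starting parameters for the descent.
params, covariance = curve_fit(model, x, y, p0=[1.0, 2.0, 1.0])
```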

I tried to make my answers clear without being too lengthy (I hope).

James

nshdesai commented 7 years ago

Hi, I have added this feature to the code. Although it doesn't fit as well as it does for your dataset, it is best to leave it this way to prevent overfitting. If you have any other suggestions, please let me know, and if you think this is a sufficient addition, we can close this issue.

zunzun commented 7 years ago

Your new code is most excellent.