sjwhitworth / golearn

Machine Learning for Go

`optimisation` doesn't do optimization #81

Closed amitkgupta closed 9 years ago

amitkgupta commented 9 years ago

Call me crazy, but `BatchGradientDescent` doesn't find the min or the argmin, as the GradientDescent part of the function name and the `optimisation` package name would suggest. It also doesn't do parameter estimation as the tests suggest.

I actually don't know what "parameter estimation" even means in this context, but I'm guessing that, assuming y = a1*x1 + a2*x2 + ... + an*xn (a dot product), it attempts to figure out what the linear coefficient parameters a1, ..., an are, given a bunch of observed values y and observed tuples (x1, ..., xn).
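In code terms, my guess at the model is something like this (a hypothetical sketch with made-up parameters, not golearn code):

```go
package main

import "fmt"

// dot computes a1*x1 + a2*x2 + ... + an*xn.
func dot(a, x []float64) float64 {
	s := 0.0
	for i := range a {
		s += a[i] * x[i]
	}
	return s
}

func main() {
	// Hypothetical "true" parameters that estimation should recover
	// from many observed (x, y) pairs.
	a := []float64{2, -1, 3}
	x := []float64{1, 4, 2}
	fmt.Println(dot(a, x)) // observed y = 2*1 - 1*4 + 3*2 = 4
}
```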

Can someone enlighten me on what this code does and/or is supposed to do?

Sentimentron commented 9 years ago

So I did a quick double-check on paper (https://github.com/sjwhitworth/golearn/wiki/Gradient-Descent), and it does seem like it's doing what it's supposed to.

Could you elaborate on any weird behaviour you're observing?

amitkgupta commented 9 years ago

What is it supposed to do first of all? Find the argmin of a function?

So if y = (x0-7)^2 + (x1+3)^2

And you give it a bunch of data like

X = [[1, 2], [3, 0], [-1, 2]], etc.

Y = [61, 25, 89], etc.

And a reasonable starting guess theta, it should return something like

[7, -3]

Correct?
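For concreteness, here's the generic argmin behaviour I'd expect, sketched in plain Go (my own illustration, not golearn's API), using the analytic gradient of that exact function:

```go
package main

import "fmt"

func main() {
	// f(x) = (x0-7)^2 + (x1+3)^2, minimized at (7, -3).
	// Its gradient is (2*(x0-7), 2*(x1+3)).
	x := []float64{0, 0} // a reasonable starting guess
	const lr = 0.1       // learning rate
	for i := 0; i < 200; i++ {
		x[0] -= lr * 2 * (x[0] - 7)
		x[1] -= lr * 2 * (x[1] + 3)
	}
	fmt.Printf("[%.4f %.4f]\n", x[0], x[1]) // prints roughly [7.0000 -3.0000]
}
```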

Sentimentron commented 9 years ago

It seems to be doing general matrix multiplication.

amitkgupta commented 9 years ago

To what end?

Sentimentron commented 9 years ago

Each entry of y is a linear combination of a row of x weighted by the theta values, so it's figuring out how theta should change to bring the predictions as close to y as possible.

So x = [[1, 3], [5, 8]], theta = [2, 2] (after termination), and y = [8, 26] = [[1, 3], [5, 8]] * [2, 2].
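Here's a minimal sketch of that fitting loop (my own illustration, not golearn's actual implementation), recovering theta = [2, 2] from exactly that data:

```go
package main

import "fmt"

// fit runs batch gradient descent on the squared-error loss
// J(theta) = 1/(2n) * sum_i (x_i . theta - y_i)^2.
func fit(x [][]float64, y []float64, theta []float64, lr float64, iters int) []float64 {
	n := float64(len(x))
	for it := 0; it < iters; it++ {
		grad := make([]float64, len(theta))
		for i, row := range x {
			// prediction for row i under the current theta
			pred := 0.0
			for j, v := range row {
				pred += v * theta[j]
			}
			// accumulate dJ/dtheta_j = (1/n) * sum_i (pred_i - y_i) * x_ij
			for j, v := range row {
				grad[j] += (pred - y[i]) * v / n
			}
		}
		for j := range theta {
			theta[j] -= lr * grad[j]
		}
	}
	return theta
}

func main() {
	x := [][]float64{{1, 3}, {5, 8}}
	y := []float64{8, 26}
	theta := fit(x, y, []float64{0, 0}, 0.02, 20000)
	fmt.Printf("[%.3f %.3f]\n", theta[0], theta[1]) // prints roughly [2.000 2.000]
}
```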

amitkgupta commented 9 years ago

Ah, so it's not general gradient descent; it's a specific application of gradient descent (batch gradient descent) to estimate the parameters of a function y = f(x, θ) by minimizing a certain loss function. And it's actually a very special case of that: it only handles the case where f is the dot product, y = x·θ (+ some noise).
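(Presumably the loss in question is the usual squared error, J(θ) = 1/(2m) * sum_i (x_i·θ - y_i)^2, whose gradient (1/m) * X^T (Xθ - y) is what each descent step follows.)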

That can easily be adapted to train a least-squares linear regression model, which is nice, but I wonder if it's overly specific. That's not a problem per se, but based on the names of the package and the functions I definitely wouldn't have expected something so narrow.

Sentimentron commented 9 years ago

I would agree, but is this specifically a golearn issue, or more of an issue for a more general Go-based numerical computing library?