mkuennek opened this issue 7 years ago
Hi @mkuennek,
Do you mean Polynomial regression with multiple independent variables, i.e. multiple inputs and one single output?
If yes, this should be simple to do. You can either transform your inputs into polynomial features yourself, by considering all their possible combinations of a particular degree, or you may be able to repurpose the static Transform method of the Polynomial kernel class to transform your inputs for you. After you transform your inputs into this new polynomial space, you should be able to apply the usual OrdinaryLeastSquares to obtain a MultipleLinearRegression that would be equivalent to a possible MultiplePolynomialRegression.
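For instance, with two inputs x1 and x2 and degree 2, the expanded feature vector would contain terms such as x1, x2, x1^2, x1*x2 and x2^2 (the exact set and ordering of terms produced by Polynomial.Transform may differ, so it is worth inspecting the transformed array); the regression then remains linear in these expanded features.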
An example would be:
// Let's say your current inputs and outputs are in the variables x and y below:
double[][] x = ...
double[] y = ...
// First, transform your inputs to polynomial space
double[][] z = Polynomial.Transform(x, degree: 2, constant: 0);
// Now, create the usual OLS learning algorithm
var ols = new OrdinaryLeastSquares()
{
UseIntercept = true
};
// Use the algorithm to learn a multiple regression
MultipleLinearRegression regression = ols.Learn(z, y);
// Check the quality of the regression:
double[] prediction = regression.Transform(z);
double error = new SquareLoss(expected: y).Loss(actual: prediction);
However, you might want to compare it against some other implementation or textbook example just to make sure the results indeed match. It is possible that the use of constants in the code above (i.e. in both Polynomial and OrdinaryLeastSquares) will have to be adjusted so that the constant term is not applied twice.
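If it helps, one quick way to cross-check is to build the polynomial features by hand and fit them with the same OLS learner; the predictions should then agree with the ones obtained from Polynomial.Transform (the set and ordering of generated terms might differ, so compare predictions rather than individual coefficients). A rough sketch, assuming only two input columns for brevity and a hypothetical helper named ExpandDegree2:
// Manual degree-2 expansion of two inputs: (x1, x2) -> (x1, x2, x1^2, x1*x2, x2^2).
// No constant column is added here, since UseIntercept = true already provides the intercept.
static double[][] ExpandDegree2(double[][] inputs)
{
    var expanded = new double[inputs.Length][];
    for (int i = 0; i < inputs.Length; i++)
    {
        double x1 = inputs[i][0], x2 = inputs[i][1];
        expanded[i] = new[] { x1, x2, x1 * x1, x1 * x2, x2 * x2 };
    }
    return expanded;
}
// Fit the same kind of OLS on the hand-made features (using the same x and y as above):
double[][] zManual = ExpandDegree2(x);
var olsCheck = new OrdinaryLeastSquares() { UseIntercept = true };
MultipleLinearRegression check = olsCheck.Learn(zManual, y);
double[] checkPrediction = check.Transform(zManual);
If both models give (nearly) identical predictions on a few rows, the constant term is most likely being applied only once.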
Hope it helps, Cesar
Hi @mkuennek,
Would you be able to give a bit more detail on what problem you are trying to solve? I am not completely clear from the title/body whether you are trying to solve a multiple linear regression problem or a polynomial regression or possibly something else.
If it's one of the first two, I might be able to post a code snippet to help you out. If it's something else, I might need to defer to someone more knowledgeable than me!
Thanks, Alex
@cesarsouza too fast... ;)
Hi @cesarsouza and @AlexJCross,
Sorry for not being more specific. But as you already correctly guessed, I have a supervised learning problem with multiple independent input variables and one output variable, and I want to try out polynomial regression. So, similar to multivariate linear regression, but with a polynomial.
The approach proposed by @cesarsouza looks very interesting. I will try that. Thanks for the help!
Greetings, Michael
I tried the proposed approach, and while it looks nice in theory, it does not perform well in practice. The reason is that by transforming the inputs into the polynomial space, the number of independent variables grows exponentially with the degree of the polynomial. This gets out of hand fast, so that already for a polynomial of degree 4 in my case, the regression algorithm took very long to predict values (I stopped after some minutes).
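(To put a rough number on this: if all monomials up to degree d are generated for n inputs, there are about C(n + d, d) of them, i.e. the number of columns grows roughly like n^d, so a degree-4 expansion multiplies the dimensionality very quickly.)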
Hi @mkuennek,
Thanks for the feedback! This is actually the way a polynomial regression would normally be computed (i.e. in sklearn you would use PolynomialFeatures to transform the data first and then a LinearRegression to fit them).
However, there are some other things we could try:
By the way, there is also a part that maybe I didn't get right from your last post: did you mean that the code became too slow to predict values, or that the model became too slow to train? While it is understandable that the model would take longer to train given the combinatorial explosion in the number of inputs, the evaluation time should certainly not take that long.
How many rows (samples) and columns (dimensions) do you have in your problem?
Regards, Cesar
Hi @cesarsouza ,
Thanks for the clarification! I might check the approach from your second bullet point. In any case, this was just for trying out PolynomialRegression; in the end I will probably use neural networks, as they have provided really good results in my case.
Sorry, my formulation was not very specific. The code became slow during the learning part, not when predicting. My data contained 10 columns and about 200 rows. So the approach is probably not suitable for my problem and I should use neural networks instead.
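(If all terms up to degree 4 are generated for 10 inputs, that is about C(10 + 4, 4) = 1001 polynomial columns against only roughly 200 rows, so the least-squares system ends up with more unknowns than samples, in addition to being slow to solve.)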
Thanks for the help, again!
Greetings, Michael
Issue description
Hi,
Are there any plans to add multiple polynomial regression to the framework? I am currently investigating which regression model to use for my thesis, and multiple polynomial regression is missing. I would implement it myself, but my knowledge of statistics and machine learning is not the best.
Greetings
P.S.: Fixed title