accord-net / framework

Machine learning, computer vision, statistics and general scientific computing for .NET
http://accord-framework.net
GNU Lesser General Public License v2.1

How do I calculate one of sse, mse, or rmse of multivariate linear regression using the API of version 3.8.0? #1106

Open Soldalma opened 6 years ago

Soldalma commented 6 years ago


Issue description

The code below calculates a standard error statistic using Accord.NET version 3.8.0. I would like to calculate one of Sum of Squared Errors, Mean Squared Error, or Root Mean Squared Error. How do I do that?

open Accord.Statistics.Models.Regression.Linear

let ols = OrdinaryLeastSquares()
let regression = ols.Learn(xs, y)
let stderr = regression.GetStandardError(xs, y)

Observation: The book Mastering .NET Machine Learning by Jamie Dixon shows how to obtain these three statistics with Accord.NET version 3.0.2, but the API has changed considerably since then.

I also asked this question on SO: https://stackoverflow.com/questions/47723014/how-does-one-calculate-mean-squared-error-for-a-multivarite-linear-regression-us

cesarsouza commented 6 years ago

Hi @Soldalma,

Thanks for opening the issue! Yes, unfortunately the API changed quite a bit over the last year during the transition to the .Learn() API. You can compute the other kinds of errors using the different classes in the Accord.Math.Optimization.Losses namespace.

For example, let's say you have already learned a regression model stored in the variable regression, with the input data in inputs and the expected outputs in outputs. You can use:

using Accord.Math.Optimization.Losses;

// Compute the predicted values for the input data
double[] predicted = regression.Transform(inputs);

// Now we can compute diverse error metrics using, e.g.:
double mse = new SquareLoss(outputs).Loss(predicted);

double sse = new SquareLoss(outputs) 
{
    Mean = false
}.Loss(predicted);

double rmse = new SquareLoss(outputs)
{
    Mean = true,
    Root = true
}.Loss(predicted);

// coefficient of determination r²
double r2 = new RSquaredLoss(numberOfInputs: 2, expected: outputs).Loss(predicted); 

// adjusted or weighted versions of r²
var r2loss = new RSquaredLoss(numberOfInputs: 2, expected: outputs)
{
    Adjust = true,
    // Weights = weights, // (if you have a weighted problem)
};
double adjustedR2 = r2loss.Loss(predicted);

The documentation for SquareLoss is available here. There are also further examples with different metrics here.
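
For reference, the metrics computed above correspond to the usual textbook definitions, sketched below. The symbols are assumptions introduced here for illustration: y_i is the i-th entry of outputs, \hat{y}_i the i-th entry of predicted, \bar{y} the mean of outputs, n the number of samples, and p the number of inputs (the numberOfInputs argument). The thread does not confirm the exact formulas Accord.NET uses internally (for example, any extra scaling factor or the precise adjustment applied to r²), so it is worth checking the results against a hand computation.

```latex
\begin{align}
\mathrm{SSE}  &= \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 \\
\mathrm{MSE}  &= \frac{\mathrm{SSE}}{n} \\
\mathrm{RMSE} &= \sqrt{\mathrm{MSE}} \\
R^2           &= 1 - \frac{\mathrm{SSE}}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2} \\
R^2_{\mathrm{adj}} &= 1 - \left( 1 - R^2 \right) \frac{n - 1}{n - p - 1}
\end{align}
```

As a quick sanity check under these definitions, outputs of (1, 2, 3) and predictions of (1, 2, 5) give SSE = 4, MSE = 4/3 and RMSE ≈ 1.155.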

By the way, all examples given in Jamie Dixon's book should still be working, but maybe some might be using methods currently marked as obsolete. If you have found examples that do not work anymore, please let me know so I can make sure they keep working for the foreseeable future.

Hope it helps, Cesar