dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

Clarifications Regarding ForecastBySsa #6720

Open superichmann opened 1 year ago

superichmann commented 1 year ago

Note: This is not a direct feature request but a set of questions; feature requests can be derived from them. I am posting here because it is the only place I know of where experienced contributors will answer.

  1. How can I call the mlnet forecasting functionality from a C# code file (something like RegressionExperiment, but for ForecastBySsa)?
  2. How can I pass several features (columns) to ForecastBySsa? Do I just put them in the IDataView when I call Fit? What do I specify in the inputColumnName parameter? I have tried adding more columns to the IDataView, but the results are exactly the same as without the new columns.
  3. How do I specify the type of each column (Categorical, etc.), in the same way we do with ColumnInformation?
  4. Where can I find detailed, non-scientific documentation on what each parameter of ForecastBySsa does and how it actually affects the prediction? I have already searched the official documentation, All-Samples, and the source code of the TimeSeries class, but couldn't find detailed information. Thanks!

Any help is appreciated :)

LittleLittleCloud commented 1 year ago

Let's break it down one by one.

How can I call the mlnet forecasting functionality from a C# code file (something like RegressionExperiment, but for ForecastBySsa)?

You can use Process.Start to run mlnet as a separate process, but that's not recommended. We currently don't have an end-to-end AutoML experiment for forecasting, so you would probably need to create your own forecasting AutoMLExperiment. Here's an example of how to do it: Forecasting with Luna

How can I pass several features (columns) to ForecastBySsa? Do I just put them in the IDataView when I call Fit? What do I specify in the inputColumnName parameter? I have tried adding more columns to the IDataView, but the results are exactly the same as without the new columns.

ForecastBySsa is a univariate forecasting trainer, which means it accepts only a single numeric column as its input. If you want to use several features instead, you can transform your forecasting problem into a regression task first; you should then be able to pass multiple features to a regressor. Here's an example of how to use autoregression to solve a forecasting problem: Forecasting with Luna and autoregression
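One way to picture the "forecasting as regression" idea above is to turn the series into rows of lagged values, each paired with the value that followed. The sketch below is only an illustration under assumptions: `LagFeatures`, `Row`, and `Build` are hypothetical names that are not part of ML.NET, and the number of lags is arbitrary.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not an ML.NET API): converts a univariate series into
// (lagged features, label) rows that a regression trainer could consume.
public static class LagFeatures
{
    public sealed class Row
    {
        public float[] Features { get; set; } = Array.Empty<float>();
        public float Label { get; set; }
    }

    public static List<Row> Build(IReadOnlyList<float> series, int lags)
    {
        if (lags < 1) throw new ArgumentOutOfRangeException(nameof(lags));

        var rows = new List<Row>();
        // The first usable label sits at index `lags`, since it needs `lags` earlier values.
        for (int t = lags; t < series.Count; t++)
        {
            var features = new float[lags];
            for (int k = 0; k < lags; k++)
                features[k] = series[t - lags + k]; // oldest value first

            rows.Add(new Row { Features = features, Label = series[t] });
        }
        return rows;
    }
}
```

With rows in this shape, additional columns (a second measured variable, for example) could be appended to `Features` before training a regressor, which is the part a univariate trainer cannot do.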

How do I specify the type of each column (Categorical, etc.), in the same way we do with ColumnInformation?

See the examples above.

Where can I find detailed, non-scientific documentation on what each parameter of ForecastBySsa does and how it actually affects the prediction? I have already searched the official documentation, All-Samples, and the source code of the TimeSeries class, but couldn't find detailed information.

See the examples above.

Thanks!

And you are welcome. It would be great if you could create an issue asking for an AutoML forecasting experiment, as it will help us shape the deliverables for the next release.

superichmann commented 1 year ago

Thanks! I have tried your notebooks but couldn't make the parameter-finding process automatic. I will post here again with the details.

superichmann commented 1 year ago

@LittleLittleCloud Hi again! In the example code, why do you call Predict twice? Isn't that moving the prediction engine forward twice?

    // first, get the next n predictions, where n is the horizon
    var predict = predictEngine.Predict();

    predictLoads1H.Add(predict.Predict[0]);

    // update the model with the ground-truth value
    predictEngine.Predict(new ForecastInput()
    {
        Load = load,
    });

Why not just call Fit and Transform and then parse the IDataView?

LittleLittleCloud commented 1 year ago

@superichmann It's because a forecasting model can only predict the next value, at time t. To make the model predict t+1, you need to update the model with the value at time t.
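The predict-then-update cycle described above can be sketched as a walk-forward loop. This is a minimal sketch under assumptions: `ForecastInput` and `Load` follow the snippet quoted earlier in the thread, while `ForecastOutput`, `WalkForward`, and the `windowSize`/`seriesLength` values are illustrative choices and not taken from the original notebook. It needs the Microsoft.ML and Microsoft.ML.TimeSeries packages.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Transforms.TimeSeries;

public class ForecastInput { public float Load { get; set; } }
public class ForecastOutput { public float[] Predict { get; set; } }

public static class WalkForward
{
    // Returns one-step-ahead forecasts for each point in `test`.
    // Assumes train.Length > 2 * windowSize, which ForecastBySsa requires.
    public static List<float> OneStepForecasts(float[] train, float[] test)
    {
        var ml = new MLContext(seed: 0);
        var trainView = ml.Data.LoadFromEnumerable(
            train.Select(v => new ForecastInput { Load = v }));

        // Hyperparameters below are arbitrary placeholders.
        var model = ml.Forecasting.ForecastBySsa(
            outputColumnName: nameof(ForecastOutput.Predict),
            inputColumnName: nameof(ForecastInput.Load),
            windowSize: 4,
            seriesLength: 8,
            trainSize: train.Length,
            horizon: 1).Fit(trainView);

        var engine = model.CreateTimeSeriesEngine<ForecastInput, ForecastOutput>(ml);

        var forecasts = new List<float>();
        foreach (var actual in test)
        {
            // 1) Forecast the next value without feeding a new observation.
            forecasts.Add(engine.Predict().Predict[0]);

            // 2) Feed the observed value so the next forecast is for the following step.
            engine.Predict(new ForecastInput { Load = actual });
        }
        return forecasts;
    }
}
```

The first `Predict()` call reads the forecast; the second call passes the observation in, and its return value is ignored here, matching the pattern in the quoted snippet.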

superichmann commented 1 year ago

@LittleLittleCloud Thanks. And what do you think about this way of evaluating one forecast at a time? I get "too good" results from it.

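    class MyDataForecast
    {
        public float Truth { get; set; }
        public float[] Forecast { get; set; }
    }

    // X is the MLContext instance.
    // Arguments named for readability: windowSize 2, seriesLength 5, trainSize 88, horizon 1.
    var xmodel = X.Forecasting.ForecastBySsa(
        outputColumnName: "Forecast", inputColumnName: "Truth",
        windowSize: 2, seriesLength: 5, trainSize: 88, horizon: 1);
    var xtransformer = xmodel.Fit(trainData);
    var transformed = xtransformer.Transform(TestData);

    // Keep only the first forecast value of each row so it can be scored as a scalar.
    var forecastData = X.Data.CreateEnumerable<MyDataForecast>(transformed, reuseRowObject: false)
        .Select(row => new
        {
            Truth = row.Truth,
            Forecast = row.Forecast[0]
        });
    var forecastDataView = X.Data.LoadFromEnumerable(forecastData);

    // Score the forecasts against the truth column with the regression evaluator.
    var metrics = X.Regression.Evaluate(forecastDataView, labelColumnName: "Truth", scoreColumnName: "Forecast");
    Console.WriteLine(metrics.MeanAbsoluteError);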