ML Notebook Enhancements #68

luisquintanilla opened this issue 2 years ago

Example

Original

var context = new MLContext(seed: 1);
var pipeline = context.Transforms.Concatenate("Features", "X")
  .Append(context.Auto().Regression("y", useLbfgs: false, useSdca: false, useFastForest: false));

var monitor = new NotebookMonitor();
var experiment = context.Auto().CreateExperiment();
experiment.SetPipeline(pipeline)
  .SetEvaluateMetric(RegressionMetric.RootMeanSquaredError, "y")
  .SetTrainingTimeInSeconds(30)
  .SetDataset(trainTestSplit.TrainSet, trainTestSplit.TestSet)
  .SetMonitor(monitor);

// Configure Visualizer         
monitor.SetUpdate(monitor.Display());

var res = await experiment.RunAsync();

Update

Initialize MLContext

MLContext is the starting point for all ML.NET applications.

var context = new MLContext(seed: 1);
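
The configuration step further down references trainTestSplit, which these snippets never define. A minimal sketch of how it could be produced, assuming the data lives in a CSV file whose columns match the "X" and "y" names used in the pipeline (the file name, ModelInput class, and 20% test fraction are illustrative assumptions, not part of the original notebook):

using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical input schema; adjust the LoadColumn indices to your file layout.
public class ModelInput
{
    [LoadColumn(0)]
    public float X { get; set; }

    [LoadColumn(1)]
    public float y { get; set; }
}

// Load the data and reserve 20% of the rows for evaluation.
var data = context.Data.LoadFromTextFile<ModelInput>("data.csv", hasHeader: true, separatorChar: ',');
var trainTestSplit = context.Data.TrainTestSplit(data, testFraction: 0.2);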

Define training pipeline

The pipeline concatenates the input column "X" into a single "Features" vector and appends an AutoML regression estimator that sweeps over candidate trainers; the useLbfgs, useSdca, and useFastForest flags exclude those trainers from the search.

var pipeline = context.Transforms.Concatenate("Features", "X")
    .Append(context.Auto().Regression("y", useLbfgs: false, useSdca: false, useFastForest: false));

Initialize Monitor

The notebook monitor visualizes training progress as AutoML searches for the best model for your data.

var monitor = new NotebookMonitor();

Initialize AutoML Experiment

An AutoML experiment is a collection of trials, each of which explores a candidate algorithm and hyperparameter configuration.

var experiment = context.Auto().CreateExperiment();

Configure AutoML Experiment

The AutoML experiment searches for the best algorithm as measured by an evaluation metric; here, the metric is Root Mean Squared Error. The goal is to find the model that optimizes this metric within the provided training time, which is set to 30 seconds. The longer you train, the more algorithms and hyperparameters AutoML is able to explore. The training set is the data AutoML uses to train candidate models, and the test set is used to compute the evaluation metric so you can see how well each model selected by AutoML performs.

experiment.SetPipeline(pipeline)
    .SetEvaluateMetric(RegressionMetric.RootMeanSquaredError, "y")
    .SetTrainingTimeInSeconds(30)
    .SetDataset(trainTestSplit.TrainSet, trainTestSplit.TestSet)
    .SetMonitor(monitor);

Set monitor to display

Calling Display() renders the monitor in the notebook's output, and SetUpdate registers that rendered view so it refreshes as trials complete.

monitor.SetUpdate(monitor.Display());

Run AutoML experiment

RunAsync runs trials until the training time budget is exhausted and returns the result of the best trial.

var res = await experiment.RunAsync();
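
Inspect results

It would also help to show readers what to do with the returned result. A minimal sketch, assuming ML.NET's AutoML TrialResult, which exposes the winning metric via Metric and the trained model via Model (the extra Evaluate call is illustrative and mirrors the metric configured above):

// Metric of the best trial found within the time budget (RMSE here).
Console.WriteLine($"Best RMSE: {res.Metric}");

// res.Model is an ITransformer; apply it to the held-out test set.
var predictions = res.Model.Transform(trainTestSplit.TestSet);

// Recompute full regression metrics on the test set.
var metrics = context.Regression.Evaluate(predictions, labelColumnName: "y");
Console.WriteLine($"R^2: {metrics.RSquared}");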