dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

LightGBM Version #7045

Open boneatjp opened 8 months ago

boneatjp commented 8 months ago

System Information (please complete the following information):

Describe the bug

System.AccessViolationException HResult=0x80004003 Message=Attempted to read or write protected memory. This is often an indication that other memory is corrupt.

To Reproduce

Steps to reproduce the behavior:

  1. Make a console app project
  2. Install the following packages with NuGet Package Manager:
     - Microsoft.ML Version 3.0.1
     - Microsoft.ML.AutoML Version 0.21.1
     - Microsoft.ML.CpuMath Version 3.0.1
     - Microsoft.ML.DataView Version 3.0.1
     - Microsoft.ML.FastTree Version 3.0.1
     - Microsoft.ML.LightGbm Version 3.0.1
     - Microsoft.ML.Mkl.Components Version 3.0.1
     - Microsoft.ML.Mkl.Redist Version 3.0.1
  3. Edit Program.cs:
    
    using Microsoft.ML;
    using Microsoft.ML.AutoML;
    using Microsoft.ML.Data;
    using static Microsoft.ML.DataOperationsCatalog;

    string sampleData = "taxi-fare-full.csv";
    MLContext mlContext = new MLContext();

    // Infer column information
    ColumnInferenceResults columnInference = mlContext.Auto()
        .InferColumns(sampleData, labelColumnName: "fare_amount", groupColumns: false);

    // Create text loader
    TextLoader loader = mlContext.Data.CreateTextLoader(columnInference.TextLoaderOptions);

    // Load data into IDataView
    IDataView data = loader.Load(sampleData);

    TrainTestData trainValidationData = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);

    SweepablePipeline pipeline = mlContext.Auto()
        .Featurizer(data, columnInformation: columnInference.ColumnInformation)
        .Append(mlContext.Auto()
            .Regression(labelColumnName: columnInference.ColumnInformation.LabelColumnName));

    AutoMLExperiment experiment = mlContext.Auto().CreateExperiment();

    var regressionMetric = RegressionMetric.RootMeanSquaredError;
    experiment
        .SetPipeline(pipeline)
        .SetRegressionMetric(regressionMetric, labelColumn: columnInference.ColumnInformation.LabelColumnName)
        .SetTrainingTimeInSeconds(100) // Training time in seconds
        .SetDataset(trainValidationData);

    // Log experiment trials
    mlContext.Log += (_, e) =>
    {
        if (e.Source.Equals("AutoMLExperiment"))
            Console.WriteLine(e.RawMessage);
    };

    TrialResult experimentResults = await experiment.RunAsync();


  4. Download [taxi-fare-full.csv](https://github.com/dotnet/machinelearning-samples/blob/main/samples/csharp/getting-started/Regression_TaxiFarePrediction/TaxiFarePrediction/Data/taxi-fare-full.csv) and set it to be copied to the output directory.
  5. At this point the program runs without errors. Now add LightGBM version 4.0.0, or the latest 4.3.0, with NuGet Package Manager.
  6. Run the program again and see the error.
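
For anyone reproducing this without the NuGet Package Manager UI, the package setup from steps 2, 4 and 5 could be expressed in the project file roughly as follows. This is only a sketch: it assumes the package added in step 5 is the one published on NuGet under the ID `LightGBM`, and it shows just the item groups, not a complete .csproj.

```xml
<!-- Sketch of the failing configuration; goes inside the <Project> element of the .csproj -->
<ItemGroup>
  <PackageReference Include="Microsoft.ML" Version="3.0.1" />
  <PackageReference Include="Microsoft.ML.AutoML" Version="0.21.1" />
  <PackageReference Include="Microsoft.ML.CpuMath" Version="3.0.1" />
  <PackageReference Include="Microsoft.ML.DataView" Version="3.0.1" />
  <PackageReference Include="Microsoft.ML.FastTree" Version="3.0.1" />
  <PackageReference Include="Microsoft.ML.LightGbm" Version="3.0.1" />
  <PackageReference Include="Microsoft.ML.Mkl.Components" Version="3.0.1" />
  <PackageReference Include="Microsoft.ML.Mkl.Redist" Version="3.0.1" />
  <!-- Step 5 (assumed package ID): removing this line returns to the working setup -->
  <PackageReference Include="LightGBM" Version="4.3.0" />
</ItemGroup>

<ItemGroup>
  <!-- Step 4: copy the data file next to the executable -->
  <None Update="taxi-fare-full.csv">
    <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
  </None>
</ItemGroup>
```

Removing the explicit `LightGBM` reference should let the dependency of Microsoft.ML.LightGbm decide the native library version again.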

Additional context

Without an explicit LightGBM 4.x package installed, ML.NET version 3.0.1 uses LightGBM version 3.3.5.
I've also asked about this at microsoft/LightGBM: [Version 3.3.5 to Version 4.3.0](https://github.com/microsoft/LightGBM/issues/6309).
LightGBM version 4.3.0 itself may work fine, so the problem might be in how ML.NET calls LightGBM.

evo11x commented 3 months ago

I am having the same problem with LightGBM 4.4.0 in multiclass classification on .NET Framework 4.8.1.

boneatjp commented 3 months ago

A few days ago, ML.NET 4.0.0-preview.24378.1 was released. I tried it with LightGBM 4.5.0 but still got the same error. I hope ML.NET 4.0.0 will use LightGBM 4.5.0 or later, or at least LightGBM 4.0.0.