dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

IndexOutOfRangeException when Training TextClassification #6477

Closed: aforoughi1 closed this issue 1 year ago

aforoughi1 commented 1 year ago

System Information (please complete the following information):

<PackageReference Include="Microsoft.ML.AutoML" Version="0.20.0" />
<PackageReference Include="Microsoft.ML.TorchSharp" Version="0.20.0" />
<PackageReference Include="TorchSharp" Version="0.99.0" />

Describe the bug

My source data structure:

Sentence is a string
Sentiment is a string: 'positive', 'negative' or 'neutral'
Label = {0: 'neutral', 1: 'positive', -1: 'negative'}

I reused the sample from https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/MLNET2/AutoMLTrialRunner

I get the following exception when I retrieve the Model from the TrialResult:

System.IndexOutOfRangeException
  HResult=0x80131508
  Message=Index was outside the bounds of the array.
  Source=Microsoft.ML.AutoML
  StackTrace:
   at Microsoft.ML.AutoML.PipelineProposer.ProposeSearchSpace()
   at Microsoft.ML.AutoML.EciCostFrugalTuner.Propose(TrialSettings settings)
   at Microsoft.ML.AutoML.AutoMLExperiment.<RunAsync>d__26.MoveNext()

To Reproduce

```csharp
string dataPath = @"FinancialPhraseBank.txt";
var columns = new[]
{
    new TextLoader.Column("Sentence", DataKind.String, 0),
    new TextLoader.Column("Sentiment", DataKind.String, 1),
    new TextLoader.Column("Label", DataKind.Single, 2)
};

var loaderOptions = new TextLoader.Options()
{
    Columns = columns,
    HasHeader = true,
    Separators = new[] { '\t' },
};

var textLoader = mlContext.Data.CreateTextLoader(loaderOptions);
var data = textLoader.Load(dataPath);
var preview = data.Preview(6000);

var tcSearchSpace = new SearchSpace<TCOption>();
var tcFactory = (MLContext ctx, TCOption param) =>
{
    return mlContext.MulticlassClassification.Trainers.TextClassification(sentence1ColumnName: "Sentence", batchSize: param.BatchSize);
};

var tcEstimator = mlContext.Auto().CreateSweepableEstimator(tcFactory, tcSearchSpace);
var pipeline = mlContext.Transforms.Conversion.MapValueToKey("Label").Append(tcEstimator);
TrainTestData trainValidationData = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);
var tcRunner = new TCRunner(context: mlContext, data: trainValidationData, pipeline: pipeline);
var monitor = new AutoMLMonitor(pipeline);

//mlContext.Log += (o, e) =>
//{
//    if (e.Source.Contains("AutoMLExperiment")) Console.WriteLine(e.RawMessage);
//    if (e.Source.Contains("NasBertTrainer")) Console.WriteLine(e.Message);
//};

AutoMLExperiment experiment = mlContext.Auto().CreateExperiment();
experiment.SetPipeline(pipeline)
          .SetMulticlassClassificationMetric(MulticlassClassificationMetric.MicroAccuracy, labelColumn: "Label")
          .SetTrainingTimeInSeconds(60)
          .SetDataset(trainValidationData)
          .SetTrialRunner(tcRunner)
          .SetMonitor(monitor);

var tcCts = new CancellationTokenSource();
TrialResult textClassificationExperimentResults = await experiment.RunAsync(tcCts.Token);
var model = textClassificationExperimentResults.Model;
```

Additional context

A cut-down sample data file is attached: FinancialPhraseBank.txt

aforoughi1 commented 1 year ago

Does the API only work for two classes, as in your sample? The API didn't work in the preview version and remains a showstopper in the release version. Could you please investigate?

aforoughi1 commented 1 year ago

The issue was caused by a missing LibTorchSharp native library: I had installed only the TorchSharp package. The sample's TCRunner.Run() masked the underlying exception, which is reproduced below. To resolve it, I installed the TorchSharp-cpu package. After that, pipeline.Fit() took 38 minutes. It is useful to add the following debugging code: `mlContext.Log += (o, e) => { if (e.Source.Contains("NasBertTrainer")) Console.WriteLine(e.Message); };`

System.DllNotFoundException
  HResult=0x80131524
  Message=Unable to load DLL 'LibTorchSharp' or one of its dependencies: The specified module could not be found. (0x8007007E)
  Source=TorchSharp
  StackTrace:
   at TorchSharp.PInvoke.LibTorchSharp.THSNN_custom_module(String name, ForwardFunctionC forward, IntPtr& pBoxedModule)
   at TorchSharp.torch.nn.Module..ctor(String name)
   at Microsoft.ML.TorchSharp.NasBert.Models.BaseModel..ctor(Options options)
   at Microsoft.ML.TorchSharp.NasBert.Models.NasBertModel..ctor(Options options, Int32 padIndex, Int32 symbolsCount, Int32 numClasses)
   at Microsoft.ML.TorchSharp.NasBert.NasBertTrainer`2.TrainerBase..ctor(NasBertTrainer`2 parent, IChannel ch, IDataView input)
   at Microsoft.ML.TorchSharp.NasBert.TextClassificationTrainer.CreateTrainer(NasBertTrainer`2 parent, IChannel ch, IDataView input)
   at Microsoft.ML.TorchSharp.NasBert.NasBertTrainer`2.Fit(IDataView input)
   at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
   at Test.TCRunner.Run(TrialSettings settings)
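
The resolution described in this comment (installing the package that carries the native LibTorchSharp binaries) can be sketched as a project-file change. This is a minimal sketch, not a confirmed fix for every setup: the `TorchSharp-cpu` version shown simply mirrors the `TorchSharp` version listed in the system information above and is an assumption, so check NuGet for a version compatible with your `Microsoft.ML.TorchSharp` release.

```xml
<ItemGroup>
  <!-- Package versions as reported in this issue -->
  <PackageReference Include="Microsoft.ML.AutoML" Version="0.20.0" />
  <PackageReference Include="Microsoft.ML.TorchSharp" Version="0.20.0" />
  <!-- Assumption: the version mirrors the TorchSharp version above; verify it on NuGet.
       TorchSharp-cpu supplies the native CPU backend that the plain TorchSharp package lacks. -->
  <PackageReference Include="TorchSharp-cpu" Version="0.99.0" />
</ItemGroup>
```

With the native backend referenced, the DllNotFoundException for 'LibTorchSharp' should no longer be raised when the trainer is constructed, although CPU-only training can be slow, as the 38-minute Fit() time reported above suggests.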