dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

mlContext.Model.Load throws System.InvalidOperationException: Tensor invalid -- empty handle. #6941

Closed kicaj29 closed 5 months ago

kicaj29 commented 5 months ago

System Information (please complete the following information):

Describe the bug

The following code throws System.InvalidOperationException: Tensor invalid -- empty handle. when mlContext.Model.Load is called a second time.

To Reproduce

Save the model by calling ctx.Model.Save, then load it twice in succession with mlContext.Model.Load, using a separate MLContext instance for each load.

Expected behavior

The same model file can be loaded a second time once the previous processing has finished.

Screenshots, Code, Sample Projects

    public class ModelInput
    {
        [LoadColumn(0)]
        [ColumnName(@"Words")]
        public string Words { get; set; }

        [LoadColumn(1)]
        [ColumnName(@"ClassId")]
        public float ClassId { get; set; }

    }
    public static class TestModelLoading
    {
        public static void Run()
        {
            ModelInput modelInput = new ModelInput()
            {
                Words = "This is first example C",
                ClassId = -1,
            };
            MLContext mlContext = new MLContext();
            DataViewSchema modelSchema;
            Console.WriteLine("Before Load");
            ITransformer trainedModel = mlContext.Model.Load($"DocumentClassificationTest.zip", out modelSchema);
            Console.WriteLine("After Load");
            IDataView testDataView = mlContext.Data.LoadFromEnumerable<ModelInput>(new ModelInput[] { modelInput }, modelSchema);
            // Define text transform estimator
            var textEstimator = mlContext.Transforms.Text.FeaturizeText("Words");
            // Fit data to estimator
            // Fitting generates a transformer that applies the operations defined by the estimator
            ITransformer textTransformer = textEstimator.Fit(testDataView);
            IDataView testDataViewForPrediction = textTransformer.Transform(testDataView);
            IDataView result = trainedModel.Transform(testDataViewForPrediction);
        }
    }

using Microsoft.ML.AutoML;
using Microsoft.ML;
using TextClassificationError;
using static Microsoft.ML.DataOperationsCatalog;
using Microsoft.ML.TorchSharp.NasBert;
using Microsoft.ML.TorchSharp;
using Microsoft.ML.Data;

Console.WriteLine("Starting...");

List<ModelInput> modelInput = new List<ModelInput>();
modelInput.Add(new ModelInput()
{
    ClassId = 1,
    Words = "This is first example A"
});
modelInput.Add(new ModelInput()
{
    ClassId = 1,
    Words = "This is first example B"
});
modelInput.Add(new ModelInput()
{
    ClassId = 2,
    Words = "This is second example A"
});
modelInput.Add(new ModelInput()
{
    ClassId = 2,
    Words = "This is second example B"
});
modelInput.Add(new ModelInput()
{
    ClassId = 3,
    Words = "This is third example A"
});
modelInput.Add(new ModelInput()
{
    ClassId = 3,
    Words = "This is third example B"
});

MLContext ctx = new MLContext();

IDataView data = ctx.Data.LoadFromEnumerable<ModelInput>(modelInput);
TrainTestData trainValidationData = ctx.Data.TrainTestSplit(data, testFraction: 0.3);

ColumnInformation colInfo = new ColumnInformation();
colInfo.TextColumnNames.Add("Words");
colInfo.NumericColumnNames.Add("DocId");
colInfo.LabelColumnName = "ClassId";

// Define text classification pipeline
// Create a pipeline for training the model
var pipeline = ctx.Transforms.Conversion.MapValueToKey(outputColumnName: "ClassId", inputColumnName: "ClassId")
                        .Append(ctx.MulticlassClassification.Trainers.TextClassification(
                            labelColumnName: "ClassId",
                            sentence1ColumnName: "Words",
                            architecture: BertArchitecture.Roberta))
                        .Append(ctx.Transforms.Conversion.MapKeyToValue(outputColumnName: "PredictedLabel", inputColumnName: "PredictedLabel"));

// Train the model using the pipeline
Console.WriteLine("Training model...");
ITransformer model = pipeline.Fit(trainValidationData.TrainSet);

// Evaluate the model's performance against the TEST data set
Console.WriteLine("Evaluating model performance...");
// We need to apply the same transformations to our test set so it can be evaluated via the resulting model
IDataView transformedTest = model.Transform(trainValidationData.TestSet);
MulticlassClassificationMetrics metrics = ctx.MulticlassClassification.Evaluate(transformedTest, labelColumnName: "ClassId");

Console.WriteLine(metrics.ConfusionMatrix.GetFormattedConfusionTable());

ctx.Model.Save(model, trainValidationData.TrainSet.Schema, $"DocumentClassificationTest.zip");

Console.WriteLine("Run1 start");
TestModelLoading.Run();
Console.WriteLine("Run1 end");
Console.WriteLine("Run2 start");
TestModelLoading.Run(); // Second run throws exception.
Console.WriteLine("Run2 end");

Console.WriteLine("Finished.");

csproj (these are the same versions that are used here: https://github.com/dotnet/machinelearning/blob/main/eng/Versions.props#L65-L66)

<Project Sdk="Microsoft.NET.Sdk">
    <PropertyGroup>
        <OutputType>Exe</OutputType>
        <TargetFramework>net8.0</TargetFramework>
        <ImplicitUsings>enable</ImplicitUsings>
        <Nullable>enable</Nullable>
    </PropertyGroup>
    <ItemGroup>
        <PackageReference Include="Microsoft.Extensions.ML" Version="3.0.0" />
        <PackageReference Include="Microsoft.ML" Version="3.0.0" />
        <PackageReference Include="Microsoft.ML.AutoML" Version="0.21.0" />
        <PackageReference Include="Microsoft.ML.TorchSharp" Version="0.21.0" />
        <PackageReference Include="TorchSharp" Version="0.99.5" />
        <PackageReference Include="libtorch-cpu-win-x64" Version="1.13.0.1" />
    </ItemGroup>
</Project>

Source code which can be used to reproduce this problem: TextClassificationError.zip


Additional context n/a

michaelgsharp commented 5 months ago

If you dispose of the model at the end of TestModelLoading.Run, the error is resolved: (trainedModel as IDisposable).Dispose()
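For readers who want to apply this workaround defensively, here is a minimal sketch of the cast-and-dispose pattern. DisposeHelper and FakeNativeModel are hypothetical names introduced only for illustration; neither is part of ML.NET or TorchSharp. The stand-in class merely mimics an object that holds native resources.

```csharp
using System;

// Hypothetical helper (not part of ML.NET): disposes an object only if it
// implements IDisposable, and reports whether Dispose was actually called.
public static class DisposeHelper
{
    public static bool DisposeIfSupported(object instance)
    {
        // Pattern matching yields false for null and for non-disposable types,
        // so this never throws the way a hard cast followed by Dispose() could.
        if (instance is IDisposable disposable)
        {
            disposable.Dispose();
            return true;
        }
        return false;
    }
}

// Stand-in for a loaded model that owns native resources (illustration only).
public sealed class FakeNativeModel : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose() => IsDisposed = true;
}
```

In the repro above, the equivalent would be to wrap the body of TestModelLoading.Run in try/finally and call DisposeHelper.DisposeIfSupported(trainedModel) in the finally block, so the model is released even if Transform throws.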

This error is coming from TorchSharp itself. I'm checking if the latest version of TorchSharp has this fixed. If not, I'll raise an issue there, but other than making sure to dispose of your model when you are done with it, there isn't much more we can do from this side.

Closing this issue for now, but I'll mention it in the TorchSharp issue when I create it.

michaelgsharp commented 5 months ago

This has actually been fixed already in the latest version of TorchSharp. Almost have ML.NET ready to update to use that version.

kicaj29 commented 5 months ago

With the change you showed above, my example works fine, thank you. It would be good if trainedModel implemented IDisposable directly; then it would be obvious that Dispose has to be called.