dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
MIT License

'Label' not found #6730

Open thoron opened 1 year ago

thoron commented 1 year ago

System Information (please complete the following information):

Describe the bug

AutoML reports that the 'Label' column is missing from the schema when loading a text (CSV) file that has no header row.

Row example:


To Reproduce

var ctx = new MLContext(1);
var opts = new TextLoader.Options
{
    HasHeader = false,
    Columns = new[]
    {
        new TextLoader.Column("Label", DataKind.UInt32, 0),
        new TextLoader.Column("Features", DataKind.Single, 1, 29)
    },
    Separators = new[] {';'},
};
var loader = ctx.Data.CreateTextLoader(opts);
var data = loader.Load(@"C:\test.csv");
var trainValidationData = ctx.Data.TrainTestSplit(data, testFraction: 0.2);
var pipeline = ctx.Auto()
var xx = ctx.Auto()

Removing the Featurizer step does not change the result; the same error occurs.

var pipeline = ctx.Transforms.Conversion.MapValueToKey("Label")

Generates error:

System.AggregateException : One or more errors occurred. (label column 'Label' not found (Parameter 'schema'))
  ----> System.ArgumentOutOfRangeException : label column 'Label' not found (Parameter 'schema')
  ML_IsMarked: 1
   at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
   at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
   at Microsoft.ML.AutoML.AutoMLExperiment.Run()

Loaded data looks as expected (screenshot omitted).

Expected behavior

AutoML should find the Label column in the schema when it has been specified explicitly in the loader.

This might be due to the missing header row and/or not using InferColumns. The schema looks fine when inspected manually at runtime. Am I missing something?

fwaris commented 3 months ago

I am running into a similar problem. In my case, the experiment uses Binary Classification.

It seems that whatever data view the evaluator sees does not have the Label column.

System.ArgumentOutOfRangeException: label column 'Label' not found (Parameter 'schema')
   at Microsoft.ML.Data.RoleMappedSchema.MapFromNames(DataViewSchema schema, IEnumerable`1 roles, Boolean opt)
   at Microsoft.ML.Data.RoleMappedSchema..ctor(DataViewSchema schema, IEnumerable`1 roles, Boolean opt)
   at Microsoft.ML.Data.RoleMappedData..ctor(IDataView data, Boolean opt, KeyValuePair`2[] roles)
   at Microsoft.ML.Data.BinaryClassifierEvaluator.Evaluate(IDataView data, String label, String score, String predictedLabel)

fwaris commented 3 months ago

I dug deeper into the AutoML code and found that the label column name the evaluator looks for is always 'label' (lower case).


I renamed "Label" to "label" everywhere, and that fixed the issue.
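
For anyone applying the same workaround to the loader from the original report, a minimal sketch of the rename might look like the following. It assumes the column layout and file path from the first post, changes only the column name to lower-case "label", and leaves out the AutoML experiment setup, which is not shown in full in this thread.

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;

var ctx = new MLContext(1);

// Same loader options as in the original report, except that the label
// column is named "label" (lower case) instead of "Label".
var opts = new TextLoader.Options
{
    HasHeader = false,
    Columns = new[]
    {
        new TextLoader.Column("label", DataKind.UInt32, 0),
        new TextLoader.Column("Features", DataKind.Single, 1, 29)
    },
    Separators = new[] { ';' },
};

var loader = ctx.Data.CreateTextLoader(opts);
var data = loader.Load(@"C:\test.csv"); // path taken from the original report

// Every later reference to the label column has to use the same name.
var pipeline = ctx.Transforms.Conversion.MapValueToKey("label");
```

Any other place that refers to the label column by name (trainers, metrics, evaluators) would need the same lower-case name for the rename to be consistent.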