Open kant2002 opened 5 years ago
Small note: in the Debug configuration the assertion behaves correctly and throws an exception, but in Release mode no exception is thrown by the assertion.
I appear to be having a similar issue.
The file itself exists: I added the check below and it passes without issue.
if (!File.Exists(filename)) throw new Exception("The data file doesn't exist.");
When calling the below code, it throws a NullReferenceException with the following stack trace.
var dataView = mlContext.Data.LoadFromTextFile<WineQualityData>(filename, separatorChar: ';', hasHeader: true);
System.NullReferenceException
HResult=0x80004003
Message=Object reference not set to an instance of an object.
Source=Microsoft.ML.Data
StackTrace:
at Microsoft.ML.Data.TextLoader.CreateTextLoader[TInput](IHostEnvironment host, Boolean hasHeader, Char separator, Boolean allowQuoting, Boolean supportSparse, Boolean trimWhitespace, IMultiStreamSource dataSample)
at Microsoft.ML.TextLoaderSaverCatalog.LoadFromTextFile[TInput](DataOperationsCatalog catalog, String path, Char separatorChar, Boolean hasHeader, Boolean allowQuoting, Boolean trimWhitespace, Boolean allowSparse)
at TestMSML.Program.DoThing(String filename) in ...
If it helps, I'm working on a .NET Framework 4.6.1 console application (not .NET Core).
Any thoughts on a workaround?
Some more info. It works if you create the text loader and define the data structure manually instead of using a class.
Using a class doesn't work:
var textReader = mlContext.Data.CreateTextLoader<WineQualityData>(separatorChar: ';', hasHeader: true);
Creating the structure manually does work:
var textLoader = mlContext.Data.CreateTextLoader(new TextLoader.Column[] {
    new TextLoader.Column("fixed acidity", DataKind.Single, 0),
    new TextLoader.Column("volatile acidity", DataKind.Single, 1),
    ...
    new TextLoader.Column("quality", DataKind.Single, 9),
}, separatorChar: ';', hasHeader: true);
Make sure the members of your training data class are mapped to columns with LoadColumn:
public class SentimentRow
{
    [LoadColumn(0)]
    public bool Sentiment { get; set; }
    [LoadColumn(1)]
    public string SentimentText { get; set; }
}
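Until the loader reports this itself, one way to catch a missing mapping early is to check the class with reflection before creating the loader. The following is only a rough sketch: `LoadColumnCheck`, `MissingLoadColumn` and `ExampleRow` are hypothetical names made up for illustration, not part of ML.NET, and the check assumes that every public instance field and property is meant to be loaded (it does not know about members you exclude deliberately). The only ML.NET type it uses is `LoadColumnAttribute` from `Microsoft.ML.Data`.

```csharp
using System;
using System.Linq;
using System.Reflection;
using Microsoft.ML.Data;

// Hypothetical helper, not part of ML.NET.
public static class LoadColumnCheck
{
    // Returns the names of public instance fields and properties of T
    // that do not carry a [LoadColumn] attribute.
    public static string[] MissingLoadColumn<T>()
    {
        const BindingFlags flags = BindingFlags.Public | BindingFlags.Instance;

        var members = typeof(T).GetFields(flags).Cast<MemberInfo>()
            .Concat(typeof(T).GetProperties(flags));

        return members
            .Where(m => m.GetCustomAttribute<LoadColumnAttribute>() == null)
            .Select(m => m.Name)
            .ToArray();
    }
}

// Hypothetical row type with one mapped and one unmapped property.
public class ExampleRow
{
    [LoadColumn(0)]
    public bool Label { get; set; }

    // No [LoadColumn] here, so the helper reports "Text".
    public string Text { get; set; }
}
```

Calling `LoadColumnCheck.MissingLoadColumn<WineQualityData>()` before `LoadFromTextFile<WineQualityData>(...)` and throwing an `InvalidOperationException` that lists the returned names gives a readable error in both Debug and Release builds, instead of the NullReferenceException.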
System information
Issue
Say we load data from a CSV file using a simple POCO class and forget to add the LoadColumn attribute to its properties. The call to CreateTextLoader<T>/CreateTextReader<T> then fails with a NullReferenceException, which is definitely not user-friendly. I tracked this down to the line https://github.com/dotnet/machinelearning/blob/master/src/Microsoft.ML.Data/DataLoadSave/Text/TextLoader.cs#L1344
where the assertion is delegated to IHostEnvironment. Since I am running LocalEnvironment, I believe the proper default behavior in a local environment would be to just throw. I could not even imagine that MS had made such a big usability mistake, so I had to clone the project manually and compile it locally to track down this error.
Source code / logs