dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

What is the best way to represent N/A if a value is not available? #7038

Open thomasd3 opened 4 months ago

thomasd3 commented 4 months ago

I'm using a binary classifier.

My feature set consists of columns representing various states in my model, with values ranging from -1 to +1. In some rows, however, some of these states are simply not present.

For example:

0, 1, 1, 1, 0, 0, 1
1, 0, 0, 1, 1, 0, 0
1, X, 1, 1, 1, 0, 1
1, 0, 0, 0, 0, 1, 1

Notice the X? In my model it means that there is no data at that position. The model is supposed to classify a specific situation based on the states of various systems. Sometimes a signal is not present, and it shouldn't be interpolated from neighbors either: it really means that this signal does not exist at that time.

What is the best way to represent this?
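
To make the question concrete, here is one common encoding, sketched in plain Python rather than ML.NET: each feature keeps its numeric value, and a second "is missing" flag column is added per feature, so the absence of a signal is itself visible to the classifier instead of being interpolated. The names `encode_row` and `MISSING`, and the fill value of 0.0 (chosen only because it is the midpoint of the -1 to +1 range), are assumptions for illustration, not something from the ML.NET API. In ML.NET, the missing cell would typically be loaded as NaN, and the `IndicateMissingValues` and `ReplaceMissingValues` transforms look like the closest built-in equivalents, but I have not verified that this is the recommended approach.

```python
MISSING = "X"  # placeholder token used in the example rows above


def encode_row(raw, fill=0.0):
    """Encode one comma-separated row as values followed by missing flags.

    A missing cell gets the neutral `fill` value plus a flag of 1.0,
    so the result has twice as many columns as the input row.
    """
    values, flags = [], []
    for token in raw.split(","):
        token = token.strip()
        if token == MISSING:
            values.append(fill)   # neutral stand-in, not an interpolation
            flags.append(1.0)     # signal absent in this row
        else:
            values.append(float(token))
            flags.append(0.0)     # signal present
    return values + flags


rows = [
    "0, 1, 1, 1, 0, 0, 1",
    "1, 0, 0, 1, 1, 0, 0",
    "1, X, 1, 1, 1, 0, 1",
    "1, 0, 0, 0, 0, 1, 1",
]
encoded = [encode_row(r) for r in rows]

for row in encoded:
    print(row)
```

With this layout the third row carries a 1.0 flag for its second feature and 0.0 flags everywhere else, while the other rows have all-zero flags. Whether a tree-based or a linear trainer makes better use of such flags is something I would still have to test.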