dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License
9k stars 1.88k forks source link

Support One Class Classification in Ml.net #5878

Open Marios7 opened 3 years ago

Marios7 commented 3 years ago

Is your feature request related to a problem? Please describe. I am building an image classifieres using transfer learning, it's all ok till we get the problem of binary classification, in our case it's (Documents vs Not Documents), it does not take long till someone realizes that unlike the data in documents class, the amount of data in the Not Document class is infinite!

Describe the solution you'd like One way to solve this problem is by using a feature extractor tensorflow model(InceptionV3, ResNet ...etc.) - which is possibile in Ml.Net by the way - in order to obtain the features' vector from the penultimate layer of the network, and then feed this vector to another One-Class-Classification/Anomaly Detection algorithm like One-Class-SVM which is NOT available in Ml.net.

Describe alternatives you've considered I tried data augmentation, adding data to the data set, changing the architecture of the network, extracting features and use algorithms other than CNN to train a model with the extracted features' vector, changing the hyperparameters, but non of these methods seems promising.

Additional context We are searching to solve an anomaly detection problem then doing a classification. the way we intend to that in is extracting the features from images with (ResNet50, ResNet101 etc.) to feed them to an anomaly detection algorithm.

luisquintanilla commented 3 years ago

Hi @Marios7

Based on the solution you'd like, this might be something you're interested in.

https://docs.microsoft.com/dotnet/machine-learning/tutorials/image-classification

In this scenario a TensorFlow model is used to featurize the images and the classifier is an ML.NET trainer.

michaelgsharp commented 3 years ago

Thanks @luisquintanilla for the input.

@Marios7 can you take a look and see if that helps? If not let us know.

Marios7 commented 3 years ago

Hi @luisquintanilla @michaelgsharp thanks for your replies, as I said in my question in the "Describe altrenatives you've considered" section:

extracting features and use algorithms other that CNN to train a model with the extracted features' vector

I already tried that, and it's not enough, actually what I'am asking for is the second step of this same approch, I'am searching for an algorithm that can perform One-Class-Classification in ML.Net, in other words, after extracting my features, I need to pass them to some kind of OCC algorithm like One-Class-SVM, so that I can Guess if an image belongs to the positive class or it does not (Like Cat vs NO-Cat), here the "NO-Cat" class is all the images in the world except those that really are Cats. I am not trying to classfy these images into Categories (Cat Vs Dog).