Open Y-oHr-N opened 5 years ago
Just for clarification, do you think that these should be part of the pipeline tuned by Auto-sklearn, or that there should be a standalone mode, AutoSklearnOutlierDetector?
According to the title, you want the second option. From my understanding, this is an unsupervised learning problem. The central assumption in Auto-sklearn is that there is a loss function which can be used to tune the hyperparameters. What would such a loss function look like for outlier detection?
Thank you for your reply. As far as I know, there are two metrics for outlier detection.
One is the square of the geometric mean of precision and recall.
outliers - Metrics for one-class classification - Cross Validated https://stats.stackexchange.com/questions/192530/metrics-for-one-class-classification
Lee, W. S., and Liu, B., "Learning with positive and unlabeled examples using weighted Logistic Regression," In Proceedings of ICML, pp. 448-455, 2003. https://www.aaai.org/Papers/ICML/2003/ICML03-060.pdf
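As a rough sketch of this first metric (not taken from any library; the function names `precision_recall_product` and `lee_liu_score` are made up for illustration), the product of precision and recall can be computed directly from label arrays. The second function is the variant from Lee and Liu, recall squared divided by the fraction of samples predicted positive, which they use because it can be estimated without labeled negatives. Here recall is computed from full labels for simplicity, and `pos_label=1` is an assumed convention.

```python
import numpy as np


def _counts(y_true, y_pred, pos_label):
    """Return (true positives, predicted positives, actual positives)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    true_pos = np.sum((y_true == pos_label) & (y_pred == pos_label))
    n_pred_pos = np.sum(y_pred == pos_label)
    n_actual_pos = np.sum(y_true == pos_label)
    return true_pos, n_pred_pos, n_actual_pos


def precision_recall_product(y_true, y_pred, pos_label=1):
    """Precision times recall, i.e. the square of their geometric mean."""
    true_pos, n_pred_pos, n_actual_pos = _counts(y_true, y_pred, pos_label)
    if n_pred_pos == 0 or n_actual_pos == 0:
        return 0.0  # assumed convention when either quantity is undefined
    precision = true_pos / n_pred_pos
    recall = true_pos / n_actual_pos
    return float(precision * recall)


def lee_liu_score(y_true, y_pred, pos_label=1):
    """recall**2 / Pr[f(X) = pos_label], following Lee and Liu (2003)."""
    true_pos, n_pred_pos, n_actual_pos = _counts(y_true, y_pred, pos_label)
    if n_pred_pos == 0 or n_actual_pos == 0:
        return 0.0  # assumed convention when either quantity is undefined
    recall = true_pos / n_actual_pos
    frac_pred_pos = n_pred_pos / len(np.asarray(y_pred))
    return float(recall ** 2 / frac_pred_pos)
```

Both functions take `(y_true, y_pred)` in the same order as scikit-learn metrics, so either could in principle be wrapped as a scorer; higher is better for both.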
The other is the area under the Mass-Volume curve.
Goix, N., "How to evaluate the quality of unsupervised anomaly detection algorithms?" In ICML Anomaly Detection Workshop, 2016. https://arxiv.org/pdf/1607.01152.pdf
Thomas, A., Clémençon, S., Feuillard, V., and Gramfort, A., "Learning hyperparameters for unsupervised anomaly detection," In ICML Anomaly Detection Workshop, 2016. https://github.com/albertcthomas/anomaly_tuning
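The second metric needs no labels at all. The following is a simplified Monte Carlo sketch in the spirit of the Mass-Volume criterion, not the implementation from either reference: for each mass level alpha it finds the score threshold that keeps a fraction alpha of the data, estimates the volume of that level set by uniform sampling in the bounding box of the data, and integrates volume over alpha. The function name `mass_volume_auc`, the alpha range, and the sample count are assumptions chosen for illustration; `score_fn` is assumed to return higher values for more normal points.

```python
import numpy as np


def mass_volume_auc(score_fn, X, n_uniform=10_000, alphas=None, seed=0):
    """Estimate the area under the Mass-Volume curve (lower is better)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    if alphas is None:
        alphas = np.linspace(0.9, 0.999, 50)  # assumed range of mass levels

    # Uniform reference sample over the bounding box of the data.
    lower, upper = X.min(axis=0), X.max(axis=0)
    box_volume = np.prod(upper - lower)
    U = rng.uniform(lower, upper, size=(n_uniform, X.shape[1]))

    scores_data = score_fn(X)
    scores_unif = score_fn(U)

    # Threshold t such that roughly a fraction alpha of the data scores >= t.
    thresholds = np.quantile(scores_data, 1.0 - alphas)

    # Volume of each level set {x : score(x) >= t}, estimated by sampling.
    volumes = np.array([np.mean(scores_unif >= t) for t in thresholds])
    volumes = volumes * box_volume

    # Trapezoidal rule over the mass levels.
    return float(np.sum(0.5 * (volumes[1:] + volumes[:-1]) * np.diff(alphas)))
```

A scoring function whose high-score regions hug the data should give a smaller area than one that does not, which is what would make this usable as a loss for hyperparameter tuning. The uniform sampling step becomes unreliable in high dimensions.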
I implemented two scikit-learn compatible metrics. https://github.com/HazureChi/kenchi/blob/master/kenchi/metrics.py
I'm afraid that I won't have the time to implement something here. Also, I think this is somewhat out of scope for Auto-sklearn if the metrics are not in scikit-learn yet.
Hi @mfeurer,
Is it possible to create a customized one-class SVM as a two-class SVM, and then put it into AutoSklearnClassifier? What I'm trying to do is
Does it sound reasonable and workable?
Any comments are highly appreciated.
JM
Hello,
scikit-learn 0.20 provides a more consistent outlier detection API. https://speakerdeck.com/albertcthomas/anomaly-detection-in-scikit-learn-ongoing-work-and-future-developments
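To illustrate what that consistency looks like in practice, here is a small sketch assuming scikit-learn 0.20 or later: several detectors can be driven through the same `fit_predict` call, which returns +1 for inliers and -1 for outliers. The toy data and the hyperparameter values are arbitrary choices for the example, not recommendations.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

# Toy data: a Gaussian cluster plus a few uniformly scattered points.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(size=(200, 2)),
    rng.uniform(-6, 6, size=(10, 2)),
])

# Four detectors behind one interface (hyperparameters are illustrative).
detectors = {
    "isolation_forest": IsolationForest(random_state=0),
    "local_outlier_factor": LocalOutlierFactor(n_neighbors=20),
    "one_class_svm": OneClassSVM(nu=0.05, gamma="scale"),
    "elliptic_envelope": EllipticEnvelope(random_state=0),
}

# fit_predict labels each sample as +1 (inlier) or -1 (outlier).
labels = {name: det.fit_predict(X) for name, det in detectors.items()}

for name, y in labels.items():
    print(name, "flagged", int(np.sum(y == -1)), "of", len(y), "samples")
```

Because every detector exposes the same call, a search over models and hyperparameters only needs a way to score each candidate, which is where a metric for outlier detection would come in.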
So I would like an estimator that, like AutoSklearnClassifier, fits and searches over all outlier detection models.
Thank you.