yzhao062 / pyod

A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques
http://pyod.readthedocs.io
BSD 2-Clause "Simplified" License

Train/Test split or not? #468

Open lorisgir opened 1 year ago

lorisgir commented 1 year ago

Hi, I have a conceptual question: should I split my dataset into train/test or not? Given that my dataset has no labels, does it make any sense to split in the first place? I mean, I could simply do something like clf.fit(data) and then read the resulting labels from clf.labels_. Since I train in an unsupervised manner, the classifier should not overfit in any way, right?

yzhao062 commented 1 year ago

For unsupervised outlier detection, there is no need to split your data into train and test sets. Just fit on all of it, as you mentioned.