Closed. kordc closed this issue 1 year ago.
@yzhao062 could you say which version of PyOD you used in ADBench? I tried release 1.0.1 and it also produces the results shown above.
Isolation Forest is a bit of a special case, since I did not implement it myself but imported it from scikit-learn;
see here: https://github.com/yzhao062/pyod/blob/master/pyod/models/iforest.py
from sklearn.ensemble import IsolationForest
One likely reason is the hyperparameter settings and some built-in randomness of Isolation Forest.
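Since the PyOD class is a thin wrapper around the scikit-learn detector, the two can be compared directly, but note they use opposite sign conventions: scikit-learn's `decision_function` gives higher values to *normal* points, while PyOD's outlier scores give higher values to *abnormal* points (as far as I can tell, PyOD's `decision_scores_` are essentially the negation of the sklearn output). A minimal sketch of the sklearn side, with the negation applied to get PyOD-style scores:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(42)
X = np.vstack([rng.randn(100, 2),            # inliers around the origin
               rng.uniform(5, 8, (5, 2))])   # planted outliers far away

clf = IsolationForest(random_state=42).fit(X)
sk_scores = clf.decision_function(X)   # sklearn convention: higher = more normal
outlier_scores = -sk_scores            # PyOD-style convention: higher = more abnormal
```

With this flip applied, the planted outliers in the last five rows receive the largest `outlier_scores`.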
According to ADBench (https://github.com/Minqi824/ADBench), we use PyOD 1.0.0 for this purpose. One thing to note: we do not predict labels but use the raw outlier scores for ROC...
y_pred should be continuous scores, and y should be binary labels for ROC.
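To see why this matters, here is a dependency-free toy illustration (the data and threshold are made up for the example): thresholding continuous scores into 0/1 labels destroys the ranking information that ROC AUC measures, so the same detector looks much worse when evaluated on its labels.

```python
def auc(y_true, y_score):
    # Rank-based AUC: probability that a random positive outranks
    # a random negative (ties count as half).
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.6, 0.5, 0.9]           # continuous outlier scores
labels = [1 if s >= 0.55 else 0 for s in scores]  # thresholded predictions

print(auc(y, scores))  # 0.875 -- ranking information preserved
print(auc(y, labels))  # 0.625 -- thresholding collapses the ranking
```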
That's an important note, I'll rerun my experiments in this manner, thanks!
This explains everything, results are now reproducible. Thank you for the help!
I was trying to explore the PyOD functionality, and I can see that the results by default are very low compared to the ADBench paper. Moreover, when I compare Isolation Forest, the scikit-learn implementation simply performs better. I used a fixed random_state so you can reproduce it.
Isolation Forest
The result is 0.6407. Compared to scikit-learn:
The above result of 0.956 corresponds more closely to the paper, where the score is 0.9832.
ECOD
The above 0.649 doesn't correspond to the paper's 99.17 (i.e., 0.9917).
HBOS
The above 0.644 doesn't correspond to the paper's 98.94 (i.e., 0.9894).
I use PyOD in the simplest possible manner, and I sanity-checked the dataset with Isolation Forest, which is why I think something is wrong.
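For completeness, a sketch of the two evaluation styles side by side, on synthetic data standing in for breastw (the data here is made up; only the evaluation pattern matters). Scoring the binary `predict()` output deflates ROC AUC in exactly the way reported above, while scoring the continuous decision values does not:

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for breastw: inliers plus a small, well-separated outlier cluster.
rng = np.random.RandomState(42)
X = np.vstack([rng.randn(200, 4), rng.uniform(4, 6, (20, 4))])
y = np.r_[np.zeros(200), np.ones(20)]

clf = IsolationForest(random_state=42).fit(X)

# Misleading: ROC AUC on thresholded binary predictions
# (sklearn returns -1 for outliers, +1 for inliers).
y_labels = (clf.predict(X) == -1).astype(int)
auc_labels = roc_auc_score(y, y_labels)

# Correct: ROC AUC on continuous scores (negated so higher = more abnormal).
auc_scores = roc_auc_score(y, -clf.decision_function(X))
```

On data this cleanly separated, `auc_scores` is near 1.0, while `auc_labels` depends on wherever the contamination threshold happens to fall.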
The dataset is available in the ADBench repository: breastw.
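For anyone rerunning this, the ADBench repository ships its datasets as `.npz` archives; assuming the usual `"X"`/`"y"` key layout, loading looks like the sketch below (the `savez` line only creates a stand-in file so the snippet runs without the repo checked out; the shapes are illustrative):

```python
import numpy as np

# Stand-in for the real breastw.npz from the ADBench repo;
# shapes here are illustrative only.
np.savez("breastw.npz", X=np.zeros((683, 9)), y=np.zeros(683))

data = np.load("breastw.npz")   # assumed keys: "X" (features), "y" (binary labels)
X, y = data["X"], data["y"]
```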