yzhao062 / pyod

A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques
http://pyod.readthedocs.io
BSD 2-Clause "Simplified" License

DIF model: duplicate normalization #546

Open ValeZ1 opened 4 months ago

ValeZ1 commented 4 months ago

In the DIF (Deep Isolation Forest) model, the fit function normalizes X and then passes the already-normalized X to decision_function to compute decision_scores_. decision_function normalizes its input again, so the training data is normalized twice. As a result, decision_scores_ does not match the scores returned by calling decision_function(X) on the same X.

Normalization in fit: https://github.com/yzhao062/pyod/blob/690a0f25987fab0664b014bbc7121d999c92f5f6/pyod/models/dif.py#L173-L175

decision_function call in fit: https://github.com/yzhao062/pyod/blob/690a0f25987fab0664b014bbc7121d999c92f5f6/pyod/models/dif.py#L215-L216

Normalization in decision_function: https://github.com/yzhao062/pyod/blob/690a0f25987fab0664b014bbc7121d999c92f5f6/pyod/models/dif.py#L241
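
To illustrate the pattern in isolation, here is a minimal NumPy-only sketch. It is not pyod's DIF code: `ToyDetector`, `FixedToyDetector`, the `_scale` helper, the min-max scaling, and the placeholder scoring rule are all hypothetical stand-ins chosen for this example. It only shows how scaling once in fit and again in decision_function makes decision_scores_ diverge from decision_function(X), and one possible way to avoid that.

```python
import numpy as np


class ToyDetector:
    """Hypothetical stand-in for the reported pattern; not pyod's DIF code."""

    def _scale(self, X):
        # Min-max scaling with parameters learned in fit (assumed for illustration).
        return (np.asarray(X, dtype=float) - self.min_) / self.range_

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        self.min_ = X.min(axis=0)
        self.range_ = X.max(axis=0) - self.min_
        X_scaled = self._scale(X)
        # Bug pattern: decision_function scales its input again,
        # so the training scores are computed on doubly scaled data.
        self.decision_scores_ = self.decision_function(X_scaled)
        return self

    def decision_function(self, X):
        X_scaled = self._scale(X)
        # Placeholder score: distance from the center of the scaled unit cube.
        return np.linalg.norm(X_scaled - 0.5, axis=1)


class FixedToyDetector(ToyDetector):
    """One possible fix: pass the raw X so it is scaled exactly once."""

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        self.min_ = X.min(axis=0)
        self.range_ = X.max(axis=0) - self.min_
        self.decision_scores_ = self.decision_function(X)
        return self


rng = np.random.default_rng(0)
X = rng.normal(loc=10.0, scale=3.0, size=(50, 2))

buggy = ToyDetector().fit(X)
fixed = FixedToyDetector().fit(X)

# True when the stored training scores differ from re-scoring the same X.
mismatch = not np.allclose(buggy.decision_scores_, buggy.decision_function(X))
# True when the stored training scores match re-scoring the same X.
consistent = np.allclose(fixed.decision_scores_, fixed.decision_function(X))
print(mismatch, consistent)
```

In this sketch the fix scores the raw X inside fit. An alternative with the same effect would be to split scoring into an internal helper that expects already-normalized input and have both fit and decision_function call it after normalizing once.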