Boehringer-Ingelheim / anomaly-detection-in-histology

Learning image representations for anomaly detection: application to discovery of histological alterations in drug development
MIT License

The anomaly score received from anomaly_detector.py #4

Closed Akihiko0123 closed 9 months ago

Akihiko0123 commented 9 months ago

As I understand it, scores from decision_function of svm.OneClassSVM generally indicate normal data when positive (e.g. 0.157952692864881) and an anomaly when negative (e.g. -0.44923574585746). Could you tell me whether the score from anomaly_detector.py follows the same convention as svm.OneClassSVM? I would appreciate it if you could tell me if my understanding is wrong. Thank you.

igsing commented 9 months ago

It depends on which score variable in the code you mean. The anomaly_detection() function in anomaly_detector.py returns negative values for predicted anomalies because it uses the scikit-learn decision_function() you mention. In the main part of anomaly_detector.py, however, the sign is flipped in some places (see line 529), so the all_scores variable has positive values for anomalies.
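
To make the two sign conventions concrete, here is a minimal sketch on synthetic data. It is not code from anomaly_detector.py: the data, the hyperparameters, and the variable names (raw_normal, raw_anomaly, flipped_anomaly) are all assumptions chosen for illustration. It only shows that scikit-learn's OneClassSVM.decision_function() is negative for outliers, and that negating it gives "positive means anomaly".

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Synthetic stand-in data (assumption): "normal" points cluster near the
# origin, "anomalous" points sit far away from the training distribution.
rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(200, 2))    # normal data only
X_normal = rng.normal(0.0, 1.0, size=(20, 2))    # held-out normal data
X_anomaly = rng.normal(8.0, 1.0, size=(20, 2))   # clearly out of distribution

# Hyperparameters are illustrative, not those used in the repository.
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1).fit(X_train)

# scikit-learn convention: positive for inliers, negative for outliers.
raw_normal = model.decision_function(X_normal)
raw_anomaly = model.decision_function(X_anomaly)

# Flipped convention: negate the score so that anomalies become positive.
flipped_anomaly = -raw_anomaly

print("mean raw score, normal :", raw_normal.mean())
print("mean raw score, anomaly:", raw_anomaly.mean())
print("mean flipped score, anomaly:", flipped_anomaly.mean())
```

When comparing numbers between scripts, it helps to note which of the two conventions each score variable follows before interpreting its sign.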

Akihiko0123 commented 9 months ago

Thank you so much for your reply. I understand that decision_function() in scikit-learn returns negative values for predicted anomalies (especially when the SVM model is trained on normal data only, as in your research). I also understand that the sign is flipped in the main part of anomaly_detector.py.

I am using decision_function() from scikit-learn, the same as your anomaly_detector.py, but without flipping the score. However, in several runs the scores from decision_function() were negative for normal data and positive for anomalous data. That is, smaller values such as -5.3 tended to be normal data and larger values such as 1.5 tended to be anomalous data. I am looking into the details and the cause of this result. Thank you again!
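
One way to check which direction a set of scores actually points, given a few samples whose labels are known, is to count how often an anomalous sample outscores a normal one. The sketch below is a hypothetical helper, not part of anomaly_detector.py or scikit-learn. The function names are made up, and apart from -5.3 and 1.5 (taken from the comment above) the example scores are invented for illustration. It reports the direction only and says nothing about the cause.

```python
def fraction_anomaly_ranked_higher(normal_scores, anomaly_scores):
    """Fraction of (anomaly, normal) pairs where the anomaly scores higher.

    Ties count as half a pair. A value near 1.0 means higher scores go with
    anomalies; a value near 0.0 means lower scores go with anomalies.
    """
    if not normal_scores or not anomaly_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for a in anomaly_scores:
        for n in normal_scores:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(normal_scores) * len(anomaly_scores))


def describe_orientation(normal_scores, anomaly_scores):
    """Summarize the score direction as a short human-readable string."""
    frac = fraction_anomaly_ranked_higher(normal_scores, anomaly_scores)
    if frac > 0.5:
        return "higher score = more anomalous"
    if frac < 0.5:
        return "lower score = more anomalous"
    return "no separation"


# Illustrative scores only: -5.3 and 1.5 come from the comment above,
# the remaining values are invented.
observed_normal = [-5.3, -4.1, -3.8]
observed_anomaly = [1.5, 0.9, 2.2]
print(describe_orientation(observed_normal, observed_anomaly))
```

Raw decision_function() output from a model fitted on normal data would be expected to give "lower score = more anomalous", so the opposite result suggests looking at which data the model was fitted on and whether the score was negated somewhere along the way.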