vc1492a / PyNomaly

Anomaly detection using LoOP: Local Outlier Probabilities, a local density based outlier detection method providing an outlier score in the range of [0,1].

Changes to distance matrix measure to improve speed #39

Closed nghiadanh26 closed 4 years ago

nghiadanh26 commented 4 years ago

Updated loop.py with my code, including a function named "distance_dn" that calculates the distance matrix and a function named "distance_point_dn" that calculates the distance from a point to the training data. I also added some time.time() calls to measure how long my code takes.
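
For readers without the diff at hand, here is a minimal sketch of what vectorized versions of these two functions could look like with NumPy. Only the function names come from the comment above; the signatures, the bodies, and the choice of Euclidean distance are assumptions made for illustration and are not the code submitted in this PR.

```python
import numpy as np


def distance_dn(data):
    """Pairwise Euclidean distance matrix for an (n, d) array.

    Hypothetical body: uses ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b,
    which avoids a Python-level double loop over the rows.
    """
    data = np.asarray(data, dtype=float)
    sq_norms = np.sum(data ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (data @ data.T)
    # Rounding can leave tiny negative values; clip them before the sqrt.
    np.maximum(sq_dists, 0.0, out=sq_dists)
    np.fill_diagonal(sq_dists, 0.0)
    return np.sqrt(sq_dists)


def distance_point_dn(point, data):
    """Euclidean distance from one point to every row of an (n, d) array."""
    data = np.asarray(data, dtype=float)
    point = np.asarray(point, dtype=float)
    return np.sqrt(np.sum((data - point) ** 2, axis=1))


points = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
print(distance_dn(points))
print(distance_point_dn([0.0, 0.0], points))
```

The expanded form used in this sketch is fast but can differ from a direct subtract-and-square computation in the last few floating-point digits, so results should be compared against a reference with a tolerance rather than for exact equality.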

vc1492a commented 4 years ago

Relates to this issue.

vc1492a commented 4 years ago

@nghiadanh26 thanks for opening this PR, I took a look at the changes.

While it does run significantly faster, I am hesitant to pursue this further at the moment, as your changes seem to alter the results by a non-negligible amount relative to the results from the original paper.

The below plot shows the LoOP values calculated for a set of values from the original paper.

[Image: plot of LoOP values from the original paper]

The file examples/multiple_gaussian_2d.py can be used to create a plot very similar to that provided in the original work, which can be used as a check to ensure that the PyNomaly implementation is as close to the original work as possible. When using the current PyNomaly code, we get the following result:

[Image: plot produced by the current PyNomaly code]

And using the code in this PR:

[Image: plot produced by the code in this PR]

It looks like your proposed changes would increase the difference between PyNomaly's results and those from the original work. Do you think you would be able to identify the cause of this increased error?
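
One way to put a number on the visual difference between the plots is to compare the two sets of LoOP scores element by element. The sketch below assumes both implementations have already produced a score per observation in the same order; the helper name compare_scores, its parameters, and the sample values are hypothetical and are not part of PyNomaly.

```python
import numpy as np


def compare_scores(reference, candidate, atol=1e-8):
    """Summarize how far candidate scores drift from reference scores.

    Hypothetical helper: both inputs are sequences of LoOP scores
    for the same observations, in the same order.
    """
    reference = np.asarray(reference, dtype=float).ravel()
    candidate = np.asarray(candidate, dtype=float).ravel()
    if reference.shape != candidate.shape:
        raise ValueError("score arrays must have the same length")
    diff = np.abs(reference - candidate)
    worst = int(np.argmax(diff))  # observation with the largest change
    return {
        "max_abs_diff": float(diff[worst]),
        "worst_index": worst,
        "within_tolerance": bool(diff[worst] <= atol),
    }


# Made-up scores, used only to show the call.
report = compare_scores([0.1, 0.5, 0.9], [0.1, 0.7, 0.9])
print(report)
```

The index of the worst-affected observation gives a concrete starting point for tracing which distances changed between the two versions.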

nghiadanh26 commented 4 years ago

Hi @vc1492a, thanks for looking at my code. I understand your concern, but right now I cannot explain why the differences between the PR's results and the original results are non-negligible. I'll dig deeper into my code and the LoOP algorithm, and I hope to be able to explain it soon.

vc1492a commented 4 years ago

Any updates @nghiadanh26?

nghiadanh26 commented 4 years ago

Not yet, but I will have one soon!

vc1492a commented 4 years ago

Any updates @nghiadanh26? I'd like to close this PR soon if there isn't a plan to explore the above, thanks!

vc1492a commented 4 years ago

Closing this PR and issue #38 due to inactivity and because the above concern (the discrepancy with the original paper) has not been resolved.