yzhao062 / pyod

A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques
http://pyod.readthedocs.io
BSD 2-Clause "Simplified" License
8.52k stars, 1.36k forks

Merge with kenchi #28

Closed Y-oHr-N closed 3 years ago

Y-oHr-N commented 5 years ago

Hi,

I am currently developing an anomaly detection package called kenchi and would like to merge this code into your package. https://github.com/HazureChi/kenchi

There are three points that I can contribute to pyod.

The first is the implementation of One-time sampling. https://github.com/HazureChi/kenchi/blob/master/kenchi/outlier_detection/distance_based.py

Sugiyama, M., and Borgwardt, K., "Rapid distance-based outlier detection via sampling," Advances in NIPS, pp. 467-475, 2013.
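
For readers unfamiliar with the method, the idea in the cited paper can be sketched roughly as follows: draw a small random subset of the data once, then score every point by its distance to the nearest sampled point. This is only a minimal NumPy illustration under that reading of the paper, not the kenchi implementation; the function name `one_time_sampling_scores` and its parameters are hypothetical.

```python
import numpy as np


def one_time_sampling_scores(X, n_samples=20, random_state=None):
    """Score each row of X by its distance to the nearest point in a
    random subset that is drawn only once (hypothetical helper)."""
    rng = np.random.default_rng(random_state)
    X = np.asarray(X, dtype=float)

    # Draw the subset once, without replacement.
    n_samples = min(n_samples, X.shape[0])
    idx = rng.choice(X.shape[0], size=n_samples, replace=False)
    S = X[idx]

    # Euclidean distances between every point and every sampled point,
    # shape (n_points, n_samples).
    diff = X[:, None, :] - S[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    # Outlier score: distance to the nearest sampled point.
    # Note: in this sketch a sampled point scores 0 against itself.
    return dist.min(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    inliers = rng.normal(size=(200, 2))
    outlier = np.array([[100.0, 100.0]])
    scores = one_time_sampling_scores(np.vstack([inliers, outlier]),
                                      n_samples=20, random_state=0)
    print(scores.shape)
```

Because the subset is drawn only once, scoring n points against s samples costs O(n * s) distance computations rather than the O(n^2) of a full nearest-neighbour search.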

The second is an implementation of metrics for evaluating outlier scoring functions. https://github.com/HazureChi/kenchi/blob/master/kenchi/metrics.py

Lee, W. S., and Liu, B., "Learning with positive and unlabeled examples using weighted Logistic Regression," In Proceedings of ICML, pp. 448-455, 2003.

Goix, N., "How to evaluate the quality of unsupervised anomaly detection algorithms?" In ICML Anomaly Detection Workshop, 2016.
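
As a rough illustration of the first reference, the Lee and Liu criterion can be computed without true negative labels as r^2 / Pr[f(X) = 1], where r is the recall on the known positive examples and Pr[f(X) = 1] is the fraction of all examples predicted positive. The sketch below assumes that formulation; the function name `lee_liu_score` and its argument names are hypothetical and are not taken from kenchi.

```python
import numpy as np


def lee_liu_score(pred_on_positives, pred_on_all):
    """Lee-Liu style criterion r**2 / Pr[f(X) = 1] (hypothetical helper).

    pred_on_positives: binary predictions for the known positive examples.
    pred_on_all: binary predictions for all examples, labeled or unlabeled.
    """
    pred_on_positives = np.asarray(pred_on_positives, dtype=bool)
    pred_on_all = np.asarray(pred_on_all, dtype=bool)

    # Recall estimated on the labeled positives only.
    recall = pred_on_positives.mean()

    # Fraction of all examples that the detector flags as positive.
    frac_predicted_positive = pred_on_all.mean()

    # Assumption of this sketch: return 0 when nothing is flagged.
    if frac_predicted_positive == 0:
        return 0.0
    return float(recall ** 2 / frac_predicted_positive)


if __name__ == "__main__":
    print(lee_liu_score([1, 1, 1, 0], [1, 1, 1, 0, 0, 0, 0, 0]))
```

A higher value means the detector recovers most known positives while flagging only a small share of the data. The Mass-Volume and Excess-Mass curves from the Goix reference are a separate family of criteria and are not covered by this sketch.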

The last is an implementation of functions that load and return various datasets. https://github.com/HazureChi/kenchi/blob/master/kenchi/datasets/base.py

If you agree, I would like to contribute actively to pyod in the future.

Thanks.

yzhao062 commented 5 years ago

Hi Kon,

Thanks for reaching out; I am so happy you are willing to help :) I have indeed starred kenchi and like it, and the two libraries overlap in many places. You are very welcome to contribute to pyod. I believe the points you raised above, such as the metrics, would be good additions.

The only pity is that you may miss the chance to become a co-author of the latest pyod paper (submitted to JMLR), since it is almost done. Rest assured that we will add you as a co-author on any future publications, though. I am not sure whether this will be a problem for you...

I should also apologize in advance: you may find many discrepancies and confusing spots when reading the pyod code. For instance, we are refactoring all docstrings from rst to numpydoc. I have also not yet written the documentation on how to contribute, but I am not too worried since you are an experienced open-source developer :)

So please let me know whether the publication issue mentioned above would be a concern. If not, you are welcome to contribute, and I look forward to collaborating!

Besides talking on GitHub, feel free to email me or reach me through other channels (Slack, WhatsApp, Telegram, etc.), whichever you prefer.

Cheers, Yue

Y-oHr-N commented 5 years ago

Thanks for agreeing to merge with kenchi. I will open pull requests soon.

I am also interested in becoming a co-author on any future publications and would like to talk over email or Slack.