CADWRDeltaModeling / dms_datastore

Data download and management tools for continuous data for Pandas. See documentation https://cadwrdeltamodeling.github.io/dms_datastore/
MIT License

Use local outlier factor (LOF) for outlier detection #42

Open dwr-psandhu opened 8 months ago

dwr-psandhu commented 8 months ago

LOF is available in sklearn, and I am adding it as a function with a test. Could you review it and see if this is the right way to add an outlier detection function?

dwr-psandhu commented 8 months ago

See commit https://github.com/CADWRDeltaModeling/dms_datastore/commit/7606c04c9f36d5f38743fea9e831effbf6d5dc9c

water-e commented 8 months ago

An anomaly detector is just a function that takes named args and returns a boolean series or dataframe, with True indicating an anomaly. You don't need to append columns or anything like that. Eventually I may re-add things like "and" and "or" as in ADTK, although complex rules haven't proven useful so far. I would look at the inputs and outputs of something simple in vtools error_detect.py, and also look at the YAML config file.
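
To make the contract concrete, here is a minimal sketch of what an LOF-based detector with that shape might look like. It is not the code from the commit above, and it does not follow any signature from vtools error_detect.py. The name `lof_anomaly` and its parameters are hypothetical, and it assumes a unique index and that LOF is applied to the values alone, one column at a time. The only library call it relies on is sklearn's `LocalOutlierFactor.fit_predict`, which labels outliers as -1.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import LocalOutlierFactor


def lof_anomaly(ts, n_neighbors=20, contamination="auto"):
    """Hypothetical detector: flag anomalies with Local Outlier Factor.

    Takes named args and returns a boolean Series or DataFrame with the
    same index (and columns) as `ts`, where True marks an anomaly.
    Nothing is appended to the input.
    """
    frame = ts.to_frame() if isinstance(ts, pd.Series) else ts
    flags = {}
    for col in frame.columns:
        valid = frame[col].dropna()  # LOF cannot handle NaN
        if len(valid) < 2:
            # Too little data to score; flag nothing in this column.
            flags[col] = pd.Series(False, index=frame.index)
            continue
        lof = LocalOutlierFactor(n_neighbors=n_neighbors,
                                 contamination=contamination)
        # fit_predict returns -1 for outliers and 1 for inliers.
        labels = lof.fit_predict(valid.to_numpy().reshape(-1, 1))
        # Missing values are reported as "not anomalous" (an assumption).
        flags[col] = pd.Series(labels == -1, index=valid.index).reindex(
            frame.index, fill_value=False)
    out = pd.DataFrame(flags, index=frame.index)
    if isinstance(ts, pd.Series):
        return out.iloc[:, 0].rename(ts.name)
    return out


# Made-up example data: a smooth signal with one injected spike.
index = pd.date_range("2024-01-01", periods=200, freq="15min")
values = np.sin(np.linspace(0, 20, 200))
values[100] = 10.0
ts = pd.Series(values, index=index, name="stage")

flags = lof_anomaly(ts, n_neighbors=20)
print(ts[flags])  # the values flagged as anomalous
```

Because the result is just a boolean mask, the caller decides what to do with it, for example `ts.mask(flags)` to blank out flagged values. Note that scoring values alone ignores time ordering; whether that suits a given station is a separate question from the interface.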