rstudio / pointblank

Data quality assessment and metadata reporting for data frames and database tables
https://rstudio.github.io/pointblank/
Other
845 stars 51 forks

Anomaly detection in table values #246

Open rich-iannone opened 3 years ago

rich-iannone commented 3 years ago

It would be great to be able to detect anomalous values in a table, particularly for time-series data (or univariate data). This needs to work well with both data frames and database tables.

ArmanAttaran commented 3 years ago

Hi, you may find this paper interesting and applicable: https://arxiv.org/pdf/1910.01793.pdf

rich-iannone commented 3 years ago

@ArmanAttaran Thanks for forwarding this to me. It was a good read; unfortunately, there’s no R package by the authors that implements their methodology. I found an earlier R package by one of the authors that uses part of what they describe, but it’s a much earlier work.

ArmanAttaran commented 3 years ago

OK, I will take a crack at it for my own use and keep you posted. It will most likely use Stan as the engine, so I’m not sure it will be useful for you.

rich-iannone commented 3 years ago

If you could develop a reproducible method (and have it in a package on CRAN), I’d use it even if it had the rstan dependency tree (I’d make it a suggested package). Also, let me know if I can be of assistance.