calogica / dbt-expectations

Port(ish) of Great Expectations to dbt test macros
https://calogica.github.io/dbt-expectations/
Apache License 2.0

Update expect_column_values_to_be_within_n_stdevs.sql #296

Closed. jroldan3 closed this 7 months ago

jroldan3 commented 7 months ago

I tried this test, but it did not detect the outlier I had mocked up, so I wasn't getting the result I expected. As an example, I generated n=50 samples with values in the tens (e.g. 10, 32, 51, 55) and then added one massive outlier, 10000, but the test kept passing. I ended up writing my own custom test, but I'm wondering whether I am using this one correctly or whether it is actually incorrect. I couldn't find any documentation for this particular test anywhere, so I thought I'd propose a change in case one is needed.

The approach I am suggesting: rather than using a z-score to set the threshold, use the number of standard deviations away from the mean as the sigma threshold. Thoughts?

Thank you.
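For reference, the scenario described above can be reproduced outside dbt. The following is a minimal Python sketch, not the macro's actual SQL: the mock values, the threshold of 3, and the helper names `outliers_by_z_score` and `outliers_by_distance` are all assumptions made for illustration. It compares the two formulations mentioned in the comment. By the standard definition, a z-score is the distance from the mean expressed in standard deviations, so the two checks should flag the same rows.

```python
from statistics import mean, stdev

# Hypothetical mock data: 50 values in the tens plus one massive outlier.
values = [10 + (i * 7) % 90 for i in range(50)] + [10000]

SIGMA_THRESHOLD = 3  # assumed threshold, chosen for illustration only


def outliers_by_z_score(xs, threshold):
    """Flag values whose absolute z-score exceeds the threshold."""
    mu, sd = mean(xs), stdev(xs)  # stdev() is the sample standard deviation
    return [x for x in xs if abs((x - mu) / sd) > threshold]


def outliers_by_distance(xs, threshold):
    """Flag values lying more than `threshold` standard deviations from the mean."""
    mu, sd = mean(xs), stdev(xs)
    return [x for x in xs if abs(x - mu) > threshold * sd]


by_z = outliers_by_z_score(values, SIGMA_THRESHOLD)
by_distance = outliers_by_distance(values, SIGMA_THRESHOLD)
print(by_z, by_distance)
```

On this mock data both checks flag only the value 10000. If the dbt test still passes on similar data, the cause may lie elsewhere, for example in how rows are grouped or aggregated before the statistics are computed. That is a guess and would need to be checked against the macro's SQL.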

Issue this PR Addresses/Closes

Closes #(Issue Number)

If you don't have an issue #, please first open an issue on the repo before submitting a PR to discuss the changes you'd like to make.

Summary of Changes

(Succinct summary of the changes introduced by this PR)

Why Do We Need These Changes

(Short description why this PR is necessary)

Reviewers

@clausherther