theonesp / p_zero-code

MIT License

QC IDEAS #6

Open theonesp opened 2 years ago

theonesp commented 2 years ago

Distributions and missingness using SQL

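-- MySQL: bookings per day for the 10 most recent days, with a text bar chart.
-- @max_ct is a user variable that tracks the largest daily count seen in the subquery;
-- each bar is scaled so that the largest count is 20 blocks wide.
-- User variables persist for the session: run "set @max_ct := null;" before re-running.
select
  dt as date,
  ct as bookings,
  repeat('█', 20 * ct / @max_ct) chart
from (
  select
    date(date_created) dt,
    case
      when isnull(@max_ct) or @max_ct < count(1) then @max_ct := count(1)
      else count(1)
    end ct
  from booking_foo
  group by 1
  order by 1 desc
  limit 10
) t;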

OUTPUT:

+------------+----------+----------------------+
| date       | bookings | chart                |
+------------+----------+----------------------+
| 2014-09-26 |        2 | ███                  |
| 2014-09-25 |        5 | ████████             |
| 2014-09-24 |        5 | ████████             |
| 2014-09-23 |       10 | █████████████████    |
| 2014-09-22 |       12 | ████████████████████ |
| 2014-09-21 |        5 | ████████             |
| 2014-09-20 |        9 | ███████████████      |
| 2014-09-19 |        6 | ██████████           |
| 2014-09-18 |        7 | ████████████         |
| 2014-09-17 |        8 | █████████████        |
+------------+----------+----------------------+
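
The query above covers the distribution half; the missingness half can be checked in the same spirit. Below is a minimal sketch, not the original setup: it uses Python's built-in sqlite3 instead of MySQL, a made-up sample of booking_foo, and a hypothetical customer_email column. The bar chart is rebuilt without user variables (a CTE plus a scalar subquery for the maximum, and substr() standing in for MySQL's repeat()), and missing values are counted as COUNT(*) minus COUNT(column).

```python
import sqlite3

# Hypothetical sample rows: (date_created, customer_email). None is a missing value.
ROWS = [
    ("2014-09-24 09:00:00", "a@example.com"),
    ("2014-09-24 10:30:00", None),
    ("2014-09-25 08:15:00", "b@example.com"),
    ("2014-09-25 11:00:00", "c@example.com"),
    ("2014-09-25 12:45:00", None),
    ("2014-09-25 17:20:00", "d@example.com"),
    ("2014-09-26 09:05:00", None),
    (None, "e@example.com"),
]


def make_db():
    """Build an in-memory table named like the one in the query above."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE booking_foo (date_created TEXT, customer_email TEXT)")
    conn.executemany("INSERT INTO booking_foo VALUES (?, ?)", ROWS)
    return conn


def daily_chart(conn, width=20):
    """Bookings per day (10 most recent days) with a bar scaled to `width` blocks."""
    sql = """
        WITH daily AS (
            SELECT date(date_created) AS dt, COUNT(*) AS ct
            FROM booking_foo
            WHERE date_created IS NOT NULL
            GROUP BY dt
            ORDER BY dt DESC
            LIMIT 10
        )
        SELECT dt, ct,
               -- cut a full-width bar down to the scaled length
               substr(:full_bar, 1,
                      CAST(round(:width * 1.0 * ct / (SELECT MAX(ct) FROM daily)) AS INTEGER))
        FROM daily
        ORDER BY dt DESC
    """
    return conn.execute(sql, {"full_bar": "█" * width, "width": width}).fetchall()


def missing_counts(conn):
    """NULL count per column: COUNT(col) skips NULLs, COUNT(*) does not."""
    total, missing_date, missing_email = conn.execute(
        """
        SELECT COUNT(*),
               COUNT(*) - COUNT(date_created),
               COUNT(*) - COUNT(customer_email)
        FROM booking_foo
        """
    ).fetchone()
    return {"total": total, "date_created": missing_date, "customer_email": missing_email}


if __name__ == "__main__":
    conn = make_db()
    for dt, ct, bar in daily_chart(conn):
        print(dt, ct, bar)
    print(missing_counts(conn))
```

The COUNT(*) minus COUNT(column) pattern should carry over to MySQL unchanged; the substr() trick is only needed because SQLite has no repeat().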

theonesp commented 2 years ago

{pointblank} (R package) for validation. It notifies you when tests fail (e.g., values out of range) and provides a CSV of the cases that fail each test. It works well in conjunction with a data dictionary that predefines expectations for your variables. #rstats #edresearch https://twitter.com/Cghlewis/status/1516876982717399047?s=20&t=2j1IGuiyhseLfWGX837ITw
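
To make the workflow in that tweet concrete, here is a rough sketch of the idea in plain Python. It is not {pointblank}'s API (that is an R package); the data dictionary, the records, and the function names are all hypothetical, and treating a missing value as a failed test is an assumption of this sketch. Each variable gets a range test from the dictionary, and the failing cases are rendered as CSV for review.

```python
import csv
import io

# Hypothetical data dictionary: inclusive allowed range per variable.
DATA_DICTIONARY = {
    "age": (0, 120),
    "score": (0, 100),
}

# Hypothetical records to validate.
RECORDS = [
    {"id": 1, "age": 34, "score": 88},
    {"id": 2, "age": 150, "score": 92},
    {"id": 3, "age": 29, "score": None},
    {"id": 4, "age": -1, "score": 101},
]


def run_checks(records, dictionary):
    """Return one entry per (test, record) pair that fails its range test."""
    failures = []
    for var, (low, high) in dictionary.items():
        test = f"{var} in [{low}, {high}]"
        for rec in records:
            value = rec.get(var)
            # Assumption: a missing value counts as a failure of the range test.
            if value is None or not (low <= value <= high):
                failures.append({"test": test, "id": rec["id"], "value": value})
    return failures


def failures_to_csv(failures):
    """Render the failing cases as CSV text (header plus one row per failure)."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["test", "id", "value"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(failures)
    return out.getvalue()


if __name__ == "__main__":
    failed = run_checks(RECORDS, DATA_DICTIONARY)
    if failed:
        print(f"{len(failed)} failing case(s)")
    print(failures_to_csv(failed))
```

Keeping the expectations in a dictionary separate from the checking code mirrors the tweet's point: the data dictionary is the single place where the rules for each variable live.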