Open jonespm opened 3 years ago
The most common form of this problem is a table returning double the results of the previous run (seen in the past in submission_dim).
There was an issue today where course_dim had 0.5% fewer courses than the day before. This was mentioned as a case worth notifying on: when something drops by a specified amount. To do that, we'll need to store this data between runs.
A few times we've had an issue come up where the results from the day's run were 2x the previous day's run. We should either:
- Store all of the runs in a database, something simple like 2 tables, or
- Store just the previous day's runs in a file on the filesystem, either SQLite or just JSON, and use that to compare.
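To make the first option concrete, here is a rough sketch of what "something simple like 2 tables" could look like using Python's built-in `sqlite3`. The schema, the table names `runs` and `table_counts`, and the helpers `open_store`, `record_run`, and `previous_counts` are all assumptions for illustration, not anything that exists in the project today. It also assumes one run per day, with dates stored as ISO strings so they sort correctly.

```python
import sqlite3

# Hypothetical two-table layout: one row per run, one row per table per run.
SCHEMA = """
CREATE TABLE IF NOT EXISTS runs (
    run_id   INTEGER PRIMARY KEY,
    run_date TEXT NOT NULL UNIQUE      -- ISO date, e.g. '2021-03-02'
);
CREATE TABLE IF NOT EXISTS table_counts (
    run_id     INTEGER NOT NULL REFERENCES runs(run_id),
    table_name TEXT    NOT NULL,
    row_count  INTEGER NOT NULL,
    PRIMARY KEY (run_id, table_name)
);
"""


def open_store(path=":memory:"):
    """Open (or create) the store and make sure both tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_run(conn, run_date, counts):
    """Save the row count of every table for one run; returns the run id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute("INSERT INTO runs (run_date) VALUES (?)", (run_date,))
        run_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO table_counts (run_id, table_name, row_count) "
            "VALUES (?, ?, ?)",
            [(run_id, name, count) for name, count in counts.items()],
        )
    return run_id


def previous_counts(conn, run_date):
    """Return {table_name: row_count} for the latest run before run_date."""
    rows = conn.execute(
        """
        SELECT tc.table_name, tc.row_count
        FROM table_counts AS tc
        JOIN runs AS r ON r.run_id = tc.run_id
        WHERE r.run_date = (SELECT MAX(run_date) FROM runs WHERE run_date < ?)
        """,
        (run_date,),
    ).fetchall()
    return dict(rows)
```

Passing a file path to `open_store` instead of `":memory:"` would cover the second option as well, since keeping only the previous day's run is just a matter of how much history gets pruned.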
We should set a difference threshold. I feel like it could differ per table and by whether it's an increase or a decrease, but ±25% seems like a good start?
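As a starting point for discussion, the comparison could look something like the sketch below. The function names (`fractional_change`, `check_counts`) and the `THRESHOLDS` mapping are made up for illustration, and the tighter decrease limit for course_dim is only an assumed value to show a per-table override. The 25% default is the one number taken from this issue.

```python
DEFAULT_THRESHOLD = 0.25  # the +/- 25% starting point, as a fraction

# Hypothetical per-table overrides: (max_increase, max_decrease) as fractions.
# The course_dim decrease limit is an assumed value, not an agreed one.
THRESHOLDS = {
    "course_dim": (0.25, 0.005),
}


def fractional_change(previous, current):
    """Signed change relative to the previous count, or None if previous is 0."""
    if previous == 0:
        return None
    return (current - previous) / previous


def check_counts(previous, current, thresholds=THRESHOLDS,
                 default=DEFAULT_THRESHOLD):
    """Return (table, previous, current, change) for every table over its limit."""
    alerts = []
    for table, cur_count in current.items():
        prev_count = previous.get(table)
        if prev_count is None:
            continue  # no earlier run to compare against
        change = fractional_change(prev_count, cur_count)
        if change is None:
            continue  # previous count was 0, so a ratio is undefined
        max_increase, max_decrease = thresholds.get(table, (default, default))
        if change >= max_increase or -change >= max_decrease:
            alerts.append((table, prev_count, cur_count, change))
    return alerts
```

With these settings, a submission_dim count that doubles gives a change of 1.0 and is flagged by the default limit, while course_dim is flagged on much smaller drops because of its override. Tables with no previous count, or a previous count of 0, are skipped here. Whether those cases should alert instead is an open question.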