mozilla / active-data-recipes

A repository of various activedata queries and recipes
Mozilla Public License 2.0

Add scheduled testing #18

Closed klahnakoski closed 6 years ago

klahnakoski commented 6 years ago

It would be nice to be notified when the schema underlying an ADR query has broken something. Travis testing may be the solution: https://docs.travis-ci.com/user/cron-jobs/

ahal commented 6 years ago

Good find, I like that! It looks like daily is the most granular we can go but that's probably good enough.

We'll likely need to write a mini test runner that invokes adr <recipe> and fails if there was an exception or if there were no results. Some recipes require command line args, so we'll want to run them multiple times with different arguments.
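
A minimal sketch of what such a runner could look like (the recipe names and argument sets below are placeholders, and the only failure signals checked are a non-zero exit code or empty output):

```python
# Hypothetical mini runner: invoke each recipe through the adr CLI and treat a
# non-zero exit code or empty output as a failure. Recipe names and argument
# combinations are illustrative only.
import subprocess
import sys

RECIPES = {
    "backout_rate": [[]],
    "task_durations": [["--branch", "mozilla-central"]],
}

failures = []
for recipe, arg_sets in RECIPES.items():
    for args in arg_sets:
        result = subprocess.run(
            ["adr", recipe] + args,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
        )
        if result.returncode != 0 or not result.stdout.strip():
            failures.append((recipe, args))

if failures:
    print("Failing recipes:", failures)
    sys.exit(1)
```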

TrangNguyenBC commented 6 years ago

Hi, I already set up cron jobs on my fork https://travis-ci.com/TrangEmily/active-data-recipes. I understand that the script to run goes into .travis.yml under the script section. Please help me define which recipes/queries should be run in the mini tests. Thanks.

ahal commented 6 years ago

I think we'll want to run all the recipes, even the ones that don't work. This way we'll have a good window into what needs to be fixed.

Ideally we can set this up such that it doesn't cause the README badge to turn red on a failure, but generates a notification e-mail when a new recipe fails (or gets fixed). I'm not really sure what Travis will allow here, but we can experiment.

Feel free to just get a subset running to start if you want.

TrangNguyenBC commented 6 years ago

Hi @ahal, I am trying something like this:

ahal commented 6 years ago

This proposal looks pretty good! Though I'd like to avoid the need to create a new branch if at all possible. It looks like Travis' cron tasks set an environment variable: https://docs.travis-ci.com/user/cron-jobs/#detecting-builds-triggered-by-cron

So maybe instead of a new branch we could add a step to the existing .travis.yml that runs your integration-tests.sh script. The script can just exit early if it detects it's not being run in a cron task. Or maybe we can figure out a way to do this detection directly in the .travis.yml.
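
For instance, if the script were written in Python (just a sketch, relying on the TRAVIS_EVENT_TYPE variable described in that doc), the early exit could be as simple as:

```python
import os
import sys

# Travis sets TRAVIS_EVENT_TYPE to "cron" for cron-triggered builds, so bail
# out quickly for ordinary push and pull request builds.
if os.environ.get("TRAVIS_EVENT_TYPE") != "cron":
    print("Not a cron build; skipping integration tests.")
    sys.exit(0)
```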

Does that make sense?

TrangNguyenBC commented 6 years ago

Hi @ahal, thanks for your comment. As I understand it, if the test fails, Travis turns the badge red. So if we follow that approach, we have to accept that the README badge can turn red on a failure. With two branches, there are two different badges in the README. Consider the case where tox passes but integration-test.sh fails: with only one branch, the master badge will be red, while with two branches the master badge stays green and the integration-test badge turns red (you will receive an email notification either way). Please let me know which way you think is better.

ahal commented 6 years ago

Good point, I didn't think of that. But I have an idea!

Instead of creating an integration-test.sh script and a new entry in tox.ini, why don't you just create a new test in the existing pytest suite?

This could be a special test that is marked as skipped if we're not running from a cron task. We could parametrize the test to run every recipe against ActiveData (maybe using subprocess to make sure we test the entire thing).

The benefit of doing this is that if a recipe starts to fail, we'll get the notification and we can mark it as xfail (expected fail in pytest), and the suite will start passing again. That way if it starts passing again, we'll also get a notification and can re-enable the test.

This will cause the master badge to flicker between red/green occasionally, but I think that's fine. After all, we want to try and keep the recipes as stable as possible, and if a bunch of them aren't working, then signaling that the repo isn't stable might actually be a good thing.
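
A rough sketch of what that test could look like (the recipe list, the xfail reason, and the cron check are assumptions, not a final implementation):

```python
# Hypothetical integration test: skipped outside Travis cron builds,
# parametrized over recipe names, and driven through the adr CLI with
# subprocess so the whole recipe runs end to end.
import os
import subprocess

import pytest

pytestmark = pytest.mark.skipif(
    os.environ.get("TRAVIS_EVENT_TYPE") != "cron",
    reason="integration tests only run in Travis cron builds",
)

RECIPES = [
    "backout_rate",
    # A broken recipe can be marked xfail until it is fixed, e.g.:
    pytest.param("task_durations", marks=pytest.mark.xfail(reason="schema change")),
]


@pytest.mark.parametrize("recipe", RECIPES)
def test_recipe(recipe):
    result = subprocess.run(
        ["adr", recipe],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    assert result.returncode == 0, result.stdout
    assert result.stdout.strip(), "recipe produced no output"
```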

TrangNguyenBC commented 6 years ago

The benefit of doing this is that if a recipe starts to fail, we'll get the notification and we can mark it as xfail (expected fail in pytest), and the suite will start passing again. That way if it starts passing again, we'll also get a notification and can re-enable the test.

Hi @ahal, I am implementing it the way you suggest, but I still have a problem with this case: the first time the recipe fails, you will receive an email notification about the "fail" result; the second time (after we mark it as xfail) you will receive an email notification about a "pass" result (although the recipe is still broken); the third time (after we fix the bug in the recipe), the recipe really passes, but you will not receive an email notification. The reason is that by default Travis only sends email when a build was just broken or is still broken, or when a previously broken build was just fixed, so you will not get a notification when the recipe is fixed (passing again). Is that ok?

ahal commented 6 years ago

Yeah, I think that's ok. I don't rely on the e-mails too much, and we can always glance at the test file to see which scripts are marked as expected fail and still need to be fixed.

If we try it out and it turns out to be a problem, maybe we can move to your two branch approach in the future. But I'd like to start as simple as possible and only move to something more complicated if we need to.

TrangNguyenBC commented 6 years ago

Hi @ahal, I just sent a PR. These are the results on Travis (my fork):

I still have to work more on "running all recipes multiple times with different arguments". However, I think I should submit the current code first, so you can give me comments before I dig into recipe arguments. I also thought about creating different ".test" files in a test/recipe_integration_tests directory, where each recipe has its corresponding test file covering all possible argument combinations (the same structure as the test/recipe_tests directory). Please have a look and tell me what I should do next. Thank you.
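
To make that per-recipe idea concrete, here is one possible (purely hypothetical) shape for it, assuming each file under test/recipe_integration_tests lists one argument combination per line:

```python
# Hypothetical collector: each <recipe>.test file lists one set of
# command-line arguments per line; every (recipe, args) pair becomes a
# separate parametrized test case.
import os
import subprocess

import pytest

TEST_DIR = os.path.join(os.path.dirname(__file__), "recipe_integration_tests")


def load_cases():
    cases = []
    for name in sorted(os.listdir(TEST_DIR)):
        recipe, _ = os.path.splitext(name)
        with open(os.path.join(TEST_DIR, name)) as fh:
            for line in fh:
                line = line.strip()
                if line:
                    cases.append((recipe, line.split()))
    return cases


@pytest.mark.parametrize("recipe,args", load_cases())
def test_recipe_with_args(recipe, args):
    result = subprocess.run(
        ["adr", recipe] + args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    assert result.returncode == 0 and result.stdout.strip()
```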

ahal commented 6 years ago

ActiveData did the migration last weekend and something seems to have broken the recipes. So if you notice it's all failing now, that's expected. Kyle's looking into it. We can land this once the issue is cleared up.

TrangNguyenBC commented 6 years ago

Yes, I am ok with that. Some of the integration tests are passing again, but the seta_accuracy unit test is failing.

ahal commented 6 years ago

I accidentally had this set to run monthly, which is why I haven't seen results. Changed to daily and looks like everything is working as expected.