OCHA-DAP / hdx-signals

HDX Signals
https://un-ocha-centre-for-humanitarian.gitbook.io/hdx-signals/
GNU General Public License v3.0
6 stars 0 forks source link

Clarify `test` and `GMAS_TEST_RUN` variables #94

Closed hannahker closed 5 months ago

hannahker commented 6 months ago

For simplicity's sake, can we consolidate a "testing" toggle into a single variable? Looking at the documentation here, it's a little confusing to me how these two interact. For example, if this test flag is TRUE on a run, what exactly should happen?

caldwellst commented 6 months ago

I'm leaning no, but happy to be convinced otherwise. I'll go through my reasoning and how I use them. Mainly it is because I think GMAS_TEST_RUN is not really a a test toggle as such, and could do with renaming, and I think having the 4 options they provide together are quite nice for development and testing. However, once described interested to hear how we could combine or simplify but still maintain the functionality they currently provide.

However, I'll let that all be up for discussion, and just describe how I think at least having the 4 toggles are very useful. Let's also use this to decide what to rename GMAS_TEST_RUN to.

So, I use test to automatically generate signals, without having to worry about if there is actually any new runs. It ensures we get signals to validate, by changing filter_alerts() to include latest signal from randomly sampled ISO3 values (or set ISO3 codes using test_filter) and bypasses all the checks and validation on existing data in the signals.parquet files. Otherwise, all the functions are designed to not run if there are no new signals, or existing signals, or any number of reasons. It's all about seeing the final content in the email and how it looks when sent, so it makes sure we have stuff that generates a campaign and allows us to preview it.

GMAS_TEST_RUN was originally just used to allow me to develop and test locally without any fear of messing things up or spending money. The only thing it really does is stop functions that save to the cloud, ping AI, and save to Mailchimp. It does this at the last minute, typically on calls to external functions that interact outside the repo.

This, I think?, makes it really nice in that you can basically do end to end test runs of the system, excluding the final saving functions (which we could unit test). However, I also really like it for local development.

So here's how the combination works, and how I find it useful:

So yeah, that's how they are used. I understand not entirely clear, but hopefully the use cases are a bit clearer at least. I'm open to changing how we call these, but would ideally have the use cases above still available to us for development. And importantly, I decided to set the GMAS_TEST_RUN indicator as environmental variable because I thought it made it much less likely you would accidentally set it and then run code, whereas a parameter you might much more easily do.

Let me know what y'all think!

hannahker commented 6 months ago

@caldwellst thanks for this great summary -- makes a lot more sense to me now and I agree that it would be challenging to get all this flexibility out of a single variable. Here are some of my initial thoughts on how we might make things clearer and easier:

Maybe this can be streamlined more in the future, but I can see how this kind of flexibility is important while the project is under such active development.