OCHA-DAP / hdx-signals

HDX Signals
https://un-ocha-centre-for-humanitarian.gitbook.io/hdx-signals/
GNU General Public License v3.0
5 stars 0 forks source link

Review approach to env variables #226

Open caldwellst opened 1 month ago

caldwellst commented 1 month ago

In https://github.com/OCHA-DAP/hdx-signals/pull/211, @hannahker had the following point that I didn't have time to address in that PR scope. Putting it here as an issue to discuss and potentially resolve.

I'm not sure if this is the convention in R, but I don't love the set up of defining the variable inside a function and then importing and calling that function throughout the code base. IMO, something like dry_run <- hs_dry_run$hs_dry_run() is quite verbose and not easy to understand what's going on. I think it's also a bit redundant to have separate hs_dry_run.R and hs_first_run.R files with very short functions.

Could we consolidate the utils/ functions to have a single globals.R script that has all of our environment variables? Is it also possible to just import variables throughout the codebase instead of functions?

caldwellst commented 1 month ago

So, agree about the current complexity and simplifying. Some points though as we think this through:

I don't love the set up of defining the variable inside a function and then importing and calling that function throughout the code base... Is it also possible to just import variables throughout the codebase instead of functions?

The reason we are calling these as functions throughout the codebase is because we don't want to hardcode this at one point in time. If you had a globals.R script that just set global variables, like a <- Sys.getenv("A", unset = TRUE), then if you as a user changed A by running Sys.setenv(A = FALSE), that would not be reflected in your code that uses a unless you loaded or reloaded the module AFTER making the change.

Thus, there is significant risk that users set environment variables but those don't reflect within the code they are running. That's why these calls to Sys.getenv() are done within a function so it's always checking the current environment wherever it is used, and any user changes to their environment are captured.

I think it's also a bit redundant to have separate hs_dry_run.R and hs_first_run.R files with very short functions.

Agreed here, and we could consolidate into maybe your globals.R idea. Maybe it is a function that returns a named list of env vars that we currently have these functions for. However, I think it has to be a function based on the reasons above, because we need to check the current environment everytime someone calls globals.R. If they were hardcoded in globals.R, once we run box::use(./src/utils/globals), any changes to our environment would not be reflected in the hardcoded variables.