ImperialCollegeLondon / covid19model

Code for modelling estimated deaths and cases for COVID19.
MIT License
944 stars 271 forks source link

Modularisation of pre and post processing for application to regional data #47

Closed payoto closed 4 years ago

payoto commented 4 years ago

This is a feature request but also an offer to help

Is your feature request related to a problem? Please describe. I'm adapting this code to run on data regarding French regions. The pre and post-processing is brittle (hard coded number of countries, hard coded reference countries) and would benefit from modularisation/parameterisation.

Describe the solution you'd like Functionalize and parameterize pre and post processing to tolerate arbitrary geographical regions to be run provided the data is present and in the correct format.

Describe alternatives you've considered I've already started adapting the code: https://github.com/payoto/covid19model/tree/france-regions

I will probably modularise it a bit but would have more incentive to do it properly if it can be useful to others (and facilitate maintenance of my fork).

Would you be interested in a PR going refactoring parts of the scripts to be functions?

s-mishra commented 4 years ago

Hi @payoto, we should be interested in it sometime in the future. but right now there are changes to stuff so rapidly that any change will eventually break the cutting edge stuff. Sorry but the pace of work has been very frantic and data/features we use are changing very rapidly

payoto commented 4 years ago

Ok, I get it, I think I made it sound way more disruptive than I meant. I'll take the example of argument parsing that got added 2 days ago:

https://github.com/ImperialCollegeLondon/covid19model/blob/6ff7dc0ddb8d79f0db3a017a32ff20f6912ff23d/base.r#L27-L65

To something like (in base.r):

# Commandline options and parsing
source("r-utils/arg-parser.r")
parsedargs <- base_arg_parse()
DEBUG <- parsedargs[["DEBUG"]]
FULL <- parsedargs[["FULL"]]
StanModel <- parsedargs[["StanModel"]]

## Reading all data
d=readRDS('data/COVID-19-up-to-date.rds')

with a r-utils/arg-parser.r file which looks like:

library(optparse)

base_arg_parse <- function (){
    # Commandline options and parsing
    parser <- OptionParser()
    parser <- add_option(parser, c("-D", "--debug"), action="store_true",
                         help="Perform a debug run of the model")
    parser <- add_option(parser, c("-F", "--full"), action="store_true",
                         help="Perform a full run of the model")
    cmdoptions <- parse_args(parser, args = commandArgs(trailingOnly = TRUE), positional_arguments = TRUE)

    # Default run parameters for the model
    if(is.null(cmdoptions$options$debug)) {
      DEBUG = Sys.getenv("DEBUG") == "TRUE"
    } else {
      DEBUG = cmdoptions$options$debug
    }

    if(is.null(cmdoptions$options$full)) {
      FULL = Sys.getenv("FULL") == "TRUE"
    } else {
      FULL = cmdoptions$options$full
    }

    if(DEBUG && FULL) {
      stop("Setting both debug and full run modes at once is invalid")
    }

    if(length(cmdoptions$args) == 0) {
      StanModel = 'base'
    } else {
      StanModel = cmdoptions$args[1]
    }

    print(sprintf("Running %s",StanModel))
    if(DEBUG) {
      print("Running in DEBUG mode")
    } else if (FULL) {
      print("Running in FULL mode")
    }

    parsedargs <- c(
            DEBUG=DEBUG,
            FULL=FULL,
            StanModel=StanModel 
        )
}
payoto commented 4 years ago

I'd done it for myself so I draft pulled it, in case you think it's useful; otherwise I can close it.