ScottishCovidResponse / SCRCIssueTracking

Central issue tracking repository for all repos in the consortium

Population status possibly recorded in an inconsistent way #742

Open peter-t-fox opened 3 years ago

peter-t-fox commented 3 years ago

The population status is the following data structure:

struct Status
{
    std::vector<int> simulation;                /*!< Number of incident (newly detected/reported) cases of covid in each simulation step. */
    std::vector<int> deaths;                    /*!< Overall number of incident deaths due to covid in each simulation step. */
    std::vector<int> hospital_deaths;           /*!< Number of incident deaths reported at hospital due to covid in each simulation step. */
    std::vector<Compartments> ends;             /*!< Population per epidemiological state and per age group on last day. */
    std::vector<std::vector<Compartments>> 
        pop_array;                              /*!< Population per epidemiological state and per age group on every day. */
};

Each model initialises the status of its population in the following way:

Status status = {{0}, {0}, {0}, {}, {}};

The simulation (cases), deaths and hospital_deaths vectors are each initialised with a single zero entry. The ends and pop_array vectors start empty. This leads to inconsistent outputs: for example, in predictive mode the ends and simu files will have a different number of entries.

It's not clear if there is a rationale for this, or if it is a mistake.
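The initial sizes described above can be checked with a minimal sketch. The `Compartments` type here is a hypothetical one-field stand-in (the real definition is not shown in this issue), and `MakeInitialStatus` is an illustrative helper name, not a function from the model code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in: the real Compartments fields are not shown in this issue.
struct Compartments { int S = 0; };

struct Status
{
    std::vector<int> simulation;
    std::vector<int> deaths;
    std::vector<int> hospital_deaths;
    std::vector<Compartments> ends;
    std::vector<std::vector<Compartments>> pop_array;
};

// Illustrative helper: returns a Status initialised the same way the models do it.
Status MakeInitialStatus()
{
    // {0} builds a one-element vector holding 0; {} builds an empty vector.
    Status status = {{0}, {0}, {0}, {}, {}};
    return status;
}
```

With this initialisation the three incidence vectors each start with one entry, while `ends` and `pop_array` start with none.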

github-actions[bot] commented 3 years ago

Heads up @thibaud-porphyre @peter-t-fox - the "Covid19_EERAModel" label was applied to this issue.

thibaud-porphyre commented 3 years ago

I am not clear what you mean here. It is entirely normal for the predictive mode ends and simu files to have a different number of entries, given that they record completely different things and for different time steps.

peter-t-fox commented 3 years ago

Okay, if we look at the OriginalModel::Run code, I can describe what I think the problem is:

    Status status = {{0}, {0}, {0}, {}, {}};

    const int n_agegroup = ageGroupData_.waifw_norm.size();

    // Start without lockdown
    bool inLockdown = false;

    // Assumes that the number of age groups matches the size of the 'agenums' vector
    std::vector<double> seed_pop = BuildPopulationSeed(ageNums_);

    std::vector<std::vector<double>> parameter_fit(ageGroupData_.waifw_norm.size(), parameter_set); 
    parameter_fit[0][5] = fixedParameters_[0].juvp_s;

    std::vector<Compartments> poparray = BuildPopulationArray(ageNums_, seedlist);

    for (int tt{1}; tt < n_sim_steps; ++tt) {
        //initialize return value
        InfectionState infection_state; 
...
            status.pop_array.push_back(poparray);
            status.simulation.push_back(infection_state.detected); 
            status.deaths.push_back(infection_state.deaths); 
            status.hospital_deaths.push_back(infection_state.hospital_deaths);
    }

The definition of the Status data structure is:

struct Status
{
    std::vector<int> simulation;                /*!< Number of incident (newly detected/reported) cases of covid in each simulation step. */
    std::vector<int> deaths;                    /*!< Overall number of incident deaths due to covid in each simulation step. */
    std::vector<int> hospital_deaths;           /*!< Number of incident deaths reported at hospital due to covid in each simulation step. */
    std::vector<Compartments> ends;             /*!< Population per epidemiological state and per age group on last day. */
    std::vector<std::vector<Compartments>> 
        pop_array;                              /*!< Population per epidemiological state and per age group on every day. */
};

We can see that the status is initialised to {{0}, {0}, {0}, {}, {}}, so we are saying that there is zero disease incidence on day 0. However, the pop_array member is initially empty.

Each time through the simulation loop, the state of the population at the end of the loop is recorded (by pushing back onto the pop_array data structure). Since the simulation loop starts on day 1, but we record incidence starting from day 0, the effect is that the pop_array data structure is out of alignment with the incidence data structures. To put it another way, if we do a 100-day simulation, we will have 100 incidence recordings, but only 99 population states.
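The off-by-one can be reproduced with a bookkeeping-only sketch that keeps the loop bounds from the snippet above and drops the epidemiology entirely. `CountRecords` is a hypothetical helper, and a plain `int` stands in for each population snapshot.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: mimics only the record-keeping of the Run loop.
// Returns {number of incidence entries, number of population snapshots}.
std::pair<std::size_t, std::size_t> CountRecords(int n_sim_steps)
{
    std::vector<int> simulation = {0};  // incidence is prefixed with a day-0 entry
    std::vector<int> pop_array;         // placeholder snapshots, initially empty

    // Same bounds as the loop in OriginalModel::Run as quoted above.
    for (int tt{1}; tt < n_sim_steps; ++tt) {
        pop_array.push_back(tt);   // one snapshot per step
        simulation.push_back(0);   // one incidence entry per step
    }
    return {simulation.size(), pop_array.size()};
}
```

Calling `CountRecords(100)` gives 100 incidence entries and 99 snapshots, matching the 100-day example.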

I think that the solution would be to record the initial state of the population, on day 0, before entering the simulation loop. We could do this by adding the following line after we build the seeded population:

    std::vector<Compartments> poparray = BuildPopulationArray(ageNums_, seedlist);

    status.pop_array.push_back(poparray);   // <---------------------- New line

    for (int tt{1}; tt < n_sim_steps; ++tt) {
        //initialize return value
        InfectionState infection_state; 

Then I think that the data structures output by the model would be consistent with each other.
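One way to guard against this kind of drift is a small alignment check on the per-step series. The sketch below is an assumption-laden illustration, not the model code: `Compartments` is a hypothetical stand-in, `IsAligned` and `RunBookkeepingOnly` are invented names, and the loop body records placeholder zeros instead of real infection results.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in: the real Compartments fields are not shown in this issue.
struct Compartments { int S = 0; };

struct Status
{
    std::vector<int> simulation;
    std::vector<int> deaths;
    std::vector<int> hospital_deaths;
    std::vector<Compartments> ends;
    std::vector<std::vector<Compartments>> pop_array;
};

// Hypothetical check: true when every per-step series has the same length.
bool IsAligned(const Status& status)
{
    const std::size_t n = status.simulation.size();
    return status.deaths.size() == n
        && status.hospital_deaths.size() == n
        && status.pop_array.size() == n;
}

// Bookkeeping-only sketch of the Run loop. When record_day_zero is true,
// the initial population is pushed before the loop, as proposed above.
Status RunBookkeepingOnly(int n_sim_steps, bool record_day_zero)
{
    Status status = {{0}, {0}, {0}, {}, {}};
    std::vector<Compartments> poparray(1);  // stand-in for BuildPopulationArray(...)

    if (record_day_zero) {
        status.pop_array.push_back(poparray);
    }

    for (int tt{1}; tt < n_sim_steps; ++tt) {
        status.pop_array.push_back(poparray);
        status.simulation.push_back(0);
        status.deaths.push_back(0);
        status.hospital_deaths.push_back(0);
    }
    return status;
}
```

In this sketch, `IsAligned(RunBookkeepingOnly(100, true))` holds, while the same call with `false` fails the check because `pop_array` is one entry short.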