There seems to be an issue with rescaling in cv.Strain.apply()

pausz commented 3 years ago

Describe the bug

There seems to be an issue with rescaling in cv.Strain.apply(). In immunity.py, line 121, the variable rescale_factor changes value. This is in a setting, where pop_scale has been specified. I'm unsure if this is expected behavior and I'm not understanding how rescaling works, but it seems odd. The rescale_factor starts as 1.0, then goes through an intermediate value and then settles in the value given in pop_scale. As a result, the first time we apply cv.strain(), we infect n_imports agents, rather than a value close to n_imports/rescale_factor.

To reproduce

Steps to reproduce the behavior:

For model issues:

import covasim as cv
cv.check_version('>=3.0.2', die=True)

import numpy as np

# start date of simulation
start_date = '2021-01-22'
# end date of simulation
end_date = '2021-12-31'

# Queensland age data - ABS September 2020
qld_age_data = {
   '0-4':  314602,
   '5-9':  339247,
  '10-14': 345205,
  '15-19': 319014,
  '20-24': 338824,
  '25-29': 370468,
  '30-34': 362541,
  '35-39': 354219,
  '40-44': 325208,
  '45-49': 348003,
  '50-54': 321168,
  '55-59': 317489,
  '60-64': 288317,
  '65-69': 254114,
  '70-74': 226033,
  '75-79': 156776,
  '80-84': 100692,
  '85-89': 57073,
  '90-94': 28134,
  '95-99': 7931,
  '100-104':1128
  }

# Update with qld population stats
cv.data.country_age_data.data['Australia'] = qld_age_data
import_dates = ['2021-01-22',
                '2021-02-22',
                '2021-03-22',
                '2021-04-22',
                '2021-05-22',
                '2021-06-22',
                '2021-07-22',
                '2021-08-22',
                '2021-09-22',
                '2021-10-22',
                '2021-11-22',
                '2021-12-22']

# Define list with imports 
imported_infections = [cv.strain('uk', days=cv.day(day, start_day=start_date), n_imports=25, rescale=True) for day in import_dates]

# simulation parameters
pars = {
    'pop_size': 200000,
    'pop_type': 'hybrid',
    'pop_scale': 25.5,    # Population scales to 5.1M ppl in QLD
    'location': 'Australia',
    'rand_seed': 42,      # Random seed to use
    'beta': 0.0113,       # QLD calibration global beta
    'pop_infected': 0,
    'n_imports': 0,
    'n_strains': 1,
    'strains': imported_infections,
    'start_day': start_date,
    'end_day':   end_date,
    'use_waning': True}

# Baseline simulator - no interventions
sim_baseline = cv.Sim(pars)
sim_baseline.initialize()        # Initialize models

sim_baseline.run()

Expected behavior

In the case pop_scale is specified, I'd expect the resulting number of imports of the specific strain, to be the number of imports specified in cv.strain() divided by pop_scale, throughout the whole simulation. For the specific values used in this example, the resulting ratio is 0.98-ish. I'd expect that sc.randround() would produces values of 0 or 1, but not larger than 1.

Screenshots or outputs I added some good ol' print statements :see_no_evil:

 python test_apply_strain.py 
Covasim 3.0.2 (2021-04-26) — © 2021 by IDM
Initializing sim with 200000 people for 364 days
Loading location-specific data for "Australia"
  Running 2021-01-01 ( 0/364) (0.00 s)  ———————————————————— 0%
  Running 2021-01-11 (10/364) (0.77 s)  ———————————————————— 3%
  Running 2021-01-21 (20/364) (1.53 s)  •——————————————————— 6%

Rescale factor:  1.0
Rescaled number of imported infections via sc.randround():  25

The rescale factor is 1.0, then 1.0 again, then 3.5831807999999996

  Running 2021-01-31 (30/364) (2.30 s)  •——————————————————— 8%
  Running 2021-02-10 (40/364) (3.09 s)  ••—————————————————— 11%
  Running 2021-02-20 (50/364) (3.90 s)  ••—————————————————— 14%
Rescale factor:  1.0
Rescaled number of imported infections via sc.randround():  25
  Running 2021-03-02 (60/364) (4.72 s)  •••————————————————— 17%
  Running 2021-03-12 (70/364) (5.58 s)  •••————————————————— 19%
  Running 2021-03-22 (80/364) (6.49 s)  ••••———————————————— 22%
Rescale factor:  3.5831807999999996
Rescaled number of imported infections via sc.randround():  7

and finally it takes the value of pop_scale

  Running 2021-04-01 (90/364) (7.40 s)  ••••———————————————— 25%
  Running 2021-04-11 (100/364) (8.31 s)  •••••——————————————— 28%
  Running 2021-04-21 (110/364) (9.37 s)  ••••••—————————————— 30%
Rescale factor:  25.5
Rescaled number of imported infections via sc.randround():  1
  Running 2021-05-01 (120/364) (10.70 s)  ••••••—————————————— 33%
  Running 2021-05-11 (130/364) (12.32 s)  •••••••————————————— 36%
  Running 2021-05-21 (140/364) (14.15 s)  •••••••————————————— 39%
Rescale factor:  25.5
Rescaled number of imported infections via sc.randround():  1
  Running 2021-05-31 (150/364) (16.07 s)  ••••••••———————————— 41%
  Running 2021-06-10 (160/364) (17.99 s)  ••••••••———————————— 44%
  Running 2021-06-20 (170/364) (19.88 s)  •••••••••——————————— 47%
Rescale factor:  25.5
Rescaled number of imported infections via sc.randround():  1
  Running 2021-06-30 (180/364) (21.74 s)  •••••••••——————————— 50%
  Running 2021-07-10 (190/364) (23.58 s)  ••••••••••—————————— 52%
  Running 2021-07-20 (200/364) (25.40 s)  •••••••••••————————— 55%
Rescale factor:  25.5
Rescaled number of imported infections via sc.randround():  1
  Running 2021-07-30 (210/364) (27.23 s)  •••••••••••————————— 58%
  Running 2021-08-09 (220/364) (29.03 s)  ••••••••••••———————— 61%
  Running 2021-08-19 (230/364) (30.85 s)  ••••••••••••———————— 63%
Rescale factor:  25.5
Rescaled number of imported infections via sc.randround():  1
  Running 2021-08-29 (240/364) (32.65 s)  •••••••••••••——————— 66%
  Running 2021-09-08 (250/364) (34.51 s)  •••••••••••••——————— 69%
  Running 2021-09-18 (260/364) (36.34 s)  ••••••••••••••—————— 72%
Rescale factor:  25.5
Rescaled number of imported infections via sc.randround():  1
  Running 2021-09-28 (270/364) (38.17 s)  ••••••••••••••—————— 74%
  Running 2021-10-08 (280/364) (39.92 s)  •••••••••••••••————— 77%
  Running 2021-10-18 (290/364) (41.81 s)  •••••••••••••••————— 80%
Rescale factor:  25.5
Rescaled number of imported infections via sc.randround():  1
  Running 2021-10-28 (300/364) (43.61 s)  ••••••••••••••••———— 82%
  Running 2021-11-07 (310/364) (45.37 s)  •••••••••••••••••——— 85%
  Running 2021-11-17 (320/364) (47.17 s)  •••••••••••••••••——— 88%
Rescale factor:  25.5
Rescaled number of imported infections via sc.randround():  1
  Running 2021-11-27 (330/364) (49.06 s)  ••••••••••••••••••—— 91%
  Running 2021-12-07 (340/364) (50.94 s)  ••••••••••••••••••—— 93%
  Running 2021-12-17 (350/364) (52.80 s)  •••••••••••••••••••— 96%
Rescale factor:  25.5
Rescaled number of imported infections via sc.randround():  1
  Running 2021-12-27 (360/364) (54.61 s)  •••••••••••••••••••— 99%
Simulation summary:
   5462224 cumulative infections
   999249 cumulative reinfections
   5462240 cumulative infectious
   3643950 cumulative symptomatic cases
   496059 cumulative severe cases
   156569 cumulative critical cases
   5405108 cumulative recoveries
   56629 cumulative deaths
       0 cumulative tests
       0 cumulative diagnoses
       0 cumulative quarantined people
       0 cumulative vaccinations
       0 cumulative vaccinated people

Platform:

covasim version:3.0.2
sciris version: 1.1.1
OS: Debian GNU/Linux 10 (buster) - Linux 4.19.0-14-amd64/Debian 4.19.171-2 (2021-01-30) x86_64 GNU/Linux
Other platform information:
- Anaconda3 2019-03,
- conda 4.10.1
- env with python 3.8.8,
- covasim installed from fork via pip install -e;

cliffckerr commented 3 years ago

@pausz Thanks! I believe this is correct. When you start off the simulation, 1 agent = 1 person. So 25 imported infections is 25 infected agents. As the scale factor increases, 1 agent > 1 person. In the limit, 1 agent = 25.5 infected people. So then 25 imported infections = 0.96 infected agents. This will round to 1 almost all the time, but very occasionally 0. If you set pop_scale to be 50, then it should be 1 half the time and 0 half the time. Each infected agent will correspond to 50 imported infections, and you can't infect half an agent, so it should be 0 (=0) half the time and 1 (=50) half the time, averaging out to 25.

pausz commented 3 years ago

@cliffckerr excellent, thank you for the explanation! And yes, if I set pop_scale to 50, it does rescale_factor 1.0, 1.0, 3.58 ... & and then 50, so that in the limit of t-> inf, the number of imports is 1 half the time and 0 half the time.

cliffckerr commented 3 years ago

Expected behavior; closing

InstituteforDiseaseModeling / covasim

There seems to be an issue with rescaling in cv.Strain.apply() #312