Open peterkasson opened 4 years ago
Yes, for me this is definitely an unresolved issue.
The simulation should report, as output, where transmission happened, depending on the configuration.
Assume everything but household size is held constant. With eight people per household, more transmissions should happen in households than with two (or let's even assume close to 1). Scaling in advance to always get 1/3 of infections in households looks very wrong to me.
South East Asia maybe had 1/3 at work/school, 1/3 at home, and 1/3 in the community. The small household sizes in Sweden are an international exception, and that should show in simulation runs. But the balance is not the only thing that matters. If it were, we could use: double betac_scale = 8.4/8.4, betah_scale = 2.0/8.4, betaw_scale = 1.0/8.4, R0_scale=2.2;
The balance would be "right", but infections go down drastically. Why? R0 for community would stay at 2.2.
The betac_scale = 8.4 effectively increases the R0 for community transmission to 8.4*2.2 = 18.48, all in the name of keeping a balance between community/household/work infections!
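The arithmetic in this comment can be checked with a short sketch. It assumes, as the comment does, that the effective factor for a setting is simply the product of that setting's scale and R0_scale. The function names and the product form are illustrative and are not taken from the simulation code.

```python
# Illustrative only: assumes effective factor = setting scale * R0_scale,
# which is how the comment above reasons about it.
R0_SCALE = 2.2


def effective_factors(betac_scale, betah_scale, betaw_scale, r0_scale=R0_SCALE):
    """Return (community, household, workplace) effective factors."""
    return (betac_scale * r0_scale, betah_scale * r0_scale, betaw_scale * r0_scale)


def relative_shares(factors):
    """Normalize the three factors so they sum to 1 (the 'balance')."""
    total = sum(factors)
    return tuple(f / total for f in factors)


# Scales as currently set for Sweden, versus the same ratios scaled down by 8.4.
scaled_up = effective_factors(8.4, 2.0, 1.0)
scaled_down = effective_factors(8.4 / 8.4, 2.0 / 8.4, 1.0 / 8.4)

print("community factor, scaled up:  ", scaled_up[0])
print("community factor, scaled down:", scaled_down[0])
print("relative shares, scaled up:   ", relative_shares(scaled_up))
print("relative shares, scaled down: ", relative_shares(scaled_down))
```

Under this assumption both parameter sets have the same ratio between settings, but every factor in the first set is 8.4 times larger. That fits the observation that scaling down changes the infection counts.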
Consider one infected person coming home to a household where either eight others or one other person lives. Any of the eight, or the single other person, could be infected, but for each one of the eight the probability should be the same as for the single person. Thus the risk that at least one person in the eight-person household is infected should be higher than with a single other person. The risk should not be scaled by the average number of persons per household just to make the risk of one infection per household equal on average...
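A minimal sketch of this household argument, assuming each other household member is infected independently with the same per-person probability. The value 0.1 and the function name are hypothetical and chosen only for illustration.

```python
def p_at_least_one(p_per_person, n_others):
    """Probability that at least one of n_others is infected,
    assuming independent, identical per-person probability."""
    return 1.0 - (1.0 - p_per_person) ** n_others


p = 0.1  # hypothetical per-person probability, not a fitted value

for n_others in (1, 8):
    at_least_one = p_at_least_one(p, n_others)
    expected_infections = n_others * p
    print(n_others, "others:",
          "P(at least one) =", at_least_one,
          "expected infections =", expected_infections)
```

With an unscaled per-person probability, both the chance of at least one household infection and the expected number of household infections grow with household size.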
I think these are the correct parameters: double betac_scale = 1.0, betah_scale = 1.0, betaw_scale = 1.0, R0_scale=2.2;
One more thing - if it should scale with population or household sizes, then changing those parameters via options should affect the scaling. They don't now. For example: double betac_scale = 84000000.0/population, ...
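A sketch of what option-dependent scaling could look like, assuming the factors are plain ratios against a reference. The 84000000.0 reference population comes from the comment above, and the reference household size of 4 comes from the "2 vs. 4" wording in the code comment quoted in this thread. The function name, the household formula, and the population of 10 million used in the example are assumptions, not the repository's implementation.

```python
# Assumed reference values (see the lead-in for where each comes from).
REFERENCE_POPULATION = 84_000_000.0
REFERENCE_HOUSEHOLD_SIZE = 4.0


def derived_scales(population, mean_household_size):
    """Recompute the scale factors from the run's own options.

    Hypothetical helper: returns (betac_scale, betah_scale).
    """
    if population <= 0 or mean_household_size <= 0:
        raise ValueError("population and household size must be positive")
    betac_scale = REFERENCE_POPULATION / population
    betah_scale = REFERENCE_HOUSEHOLD_SIZE / mean_household_size
    return betac_scale, betah_scale


# Example: a population of 10 million with a mean household size of 2.
print(derived_scales(10_000_000, 2.0))
# Example: half that population with a mean household size of 8.
print(derived_scales(5_000_000, 8.0))
```

Whether the factors should follow these ratios at all is the open question in this thread. The sketch only shows how they would track the command-line options if they did.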
Here's the theory. I'm then going to look at the code. I think you are right that if the theory below is correct then there should be a scaling factor if the household size distribution or the population size is changed
In the absence of robust contact-tracing data to estimate household vs. workplace vs. community transmission, a uniform prior would be 1/3, 1/3, 1/3. So it's a reasonable starting point.
The Imperial study (Mar 16 report) used 1/3, 1/3, 1/3 for Great Britain. They provide some additional discussion there. So one question would be whether to keep the specific transmission factors or to adapt them to maintain 1/3, 1/3, 1/3.
Mossong et al. (doi: 10.1371/journal.pmed.0050074) conducted a prospective study of contact patterns across 8 European countries. This unfortunately did not include Sweden, but did include Finland. In that study, the overall balance of contacts did not vary substantially across countries. This supports the idea of maintaining an even distribution in the absence of specific contact-tracing data to the contrary.
Does this make sense?
Not really. If the model is correct and this assumption is correct, you should see 1/3, 1/3, 1/3 as OUTPUT.
You assume it should be like this and scale in advance to get this. What if I specifically set up a scenario to not get 1/3 1/3 1/3?
"I think you are right that if the theory below is correct then there should be a scaling factor if the household size distribution or the population size is changed" Suppose I only modify mean household size, and we have betah_scale rescale because of that
But to get this balance you also effectively change the transmission rate between two people (as dividing all by a constant gives a different result)
Ah--I should be clear. This is the output for the "natural history of infection" scenario. It is not actively scaled for any of the behavioral modifications i.e. interventions. So the fraction of transmission in each setting does change across the interventions, as would be logically expected.
So it in part depends on the use scenario: whether household size and population size are viewed as effectively constant or whether they are being systematically varied in the course of a run.
Because population, betac, betah, and betaw are all parameters that can be adjusted at the command line, I think it's most reasonable to document the choices made and let the user decide whether to choose something different.
I am only talking about intervention=0
Do you agree that different mean household sizes should alter the number of infections at home? (same population)
"Household size is approximately 2X smaller in Sweden, 2 vs. 4, attributed as a 2X increase in household transmission factor." => betah_scale = 2.0
Edit3: Assume a different Sweden with a normal household size of 8, everything else the same => betah_scale = 0.5
Edit: Half the household size => double the transmission factor... ???
Edit2: Norway and Finland have about half the population of Sweden => betac_scale=16.8
PS I think there may be a semantic ambiguity here that is causing confusion. Will check the code and papers tonight.
Ok. I think "transmission" is misleading. Consider the following: each infected individual has contacts with others that may result in transmission. The overall number of contacts can be taken to be divided evenly: 33% household, 33% workplace, 33% community. This is both the uniform prior and is supported by the epidemiological data reported in Mossong et al., 2008. They found that the proportion of contacts did not vary substantially across European countries (included Germany at 2.0, GBR at 2.3, Finland at 2.8, Poland at slightly over 2.8...). So the epidemiological data support the idea that the number of contacts is more or less independent of these factors.
Thus the best practice is to set betah, betac, betaw factors such that the "infectious potential" for each infected individual (number of contacts * pairwise contact probability summed over all pairwise contacts) is evenly divided.
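A minimal sketch of this balancing, assuming the "infectious potential" of a setting is beta * (number of potential contacts) * (pairwise contact probability) and that the pairwise probability is the same in every setting. The contact counts and function names are hypothetical and only illustrate the idea. This is not the code used in the model.

```python
def balancing_betas(potential_contacts, pairwise_prob=1.0):
    """Choose one beta per setting so that beta * n * p is equal across
    settings and the potentials sum to 1."""
    k = len(potential_contacts)
    return {setting: 1.0 / (k * n * pairwise_prob)
            for setting, n in potential_contacts.items()}


def infectious_potential(betas, potential_contacts, pairwise_prob=1.0):
    """Per-setting potential: beta * number of contacts * pairwise probability."""
    return {setting: betas[setting] * potential_contacts[setting] * pairwise_prob
            for setting in betas}


# Hypothetical numbers of potential contacts for one infected individual.
contacts = {"household": 3, "workplace": 20, "community": 1000}

betas = balancing_betas(contacts)
potential = infectious_potential(betas, contacts)

print(betas)
print(potential)
```

In this sketch the setting with the fewest potential contacts gets the largest beta, which is the mechanism by which a smaller household size raises betah.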
This is done once at the beginning; the R0_scale factor is parameterized afterwards on the data March 21 - April 6 to reproduce the overall growth rates (as discussed in the supplement of our paper).
I believe this is formally correct. I will update the comments in the code to document this clearly.
So, if there is no real difference between GBR and Finland, why the need to scale?
Then the model should show that 33% in its output. If it does not, there is some other problem in the model, and just adding a scaling factor to get that 33% must be wrong. If not, why not scale down instead to reach the balance? Edit: if these factors are cancelled out by the R0_scale factor, then why do I get such different results when scaling down?
I am perfectly fine with all countries getting 33%. What I have a problem with is how it is achieved!
Are you saying that this factor exists because Swedes and Finns living in smaller households hug a lot more, and that this is what the factor is trying to capture? The factors DO increase the transmission, and you need to explain WHY. "To get 33%" is not an explanation.
The R0_scale factor covers overall infectiousness. That's the one parameter that was adjusted in a fitting procedure. The beta parameters are set according to prior data. Just to be clear about what is considered a free parameter in our models (as opposed to the code, which is a little more flexible).
Let me try to explain the 33% as follows: Consider that actual transmission occurs on a stochastic discrete contact graph (let me know if this clarifies or muddies things). i.e. each day there are a bunch of stochastic "edges" between the individual and others that represent contacts and carry probability of transmission. What the above-referenced study says is that those edges should be on average (across both time and people) evenly distributed between household contacts, workplace contacts, and community contacts. Now, we are calculating infection using a Monte Carlo procedure (good--stochastic), but for uninfected person i we include a term for each infected person j sharing household, workplace, or community. So the infected person j should have stochastic edges that are on average evenly distributed among the three settings, but the terms in the model include potential edges to each person in those settings. Therefore the beta terms are set to weight the expectation value of edges to be even across the three.
Put in real-world analogy terms: If Sven is infected, in a given day he might spend N hours in close contact with people in the household, M hours in close contact with people in the community, Q hours with people in the workplace. The idea is that N, M, and Q are, averaged across time and population, independent of the size of the household. So if the household is smaller, the relative probability of contacts to each person gets larger.
Or to say this a third way: if there are fewer people in a household and each person gives the same number of hugs per day, the likelihood that any person in a household receives a hug from the one infected person goes up.
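The analogy can be written down as a small sketch, assuming the infected person's household contact time is fixed and split evenly among the other household members. The 6 hours and the function name are hypothetical. The even split is an assumption of this sketch, not a statement about the model.

```python
def contact_hours_per_member(hours_in_household, household_size):
    """Hours of close contact each other member receives from one infected
    person, assuming a fixed total split evenly among the others."""
    others = household_size - 1
    if others <= 0:
        return 0.0  # nobody else in the household to contact
    return hours_in_household / others


N_HOURS = 6.0  # hypothetical daily household contact time

for size in (2, 4, 8):
    print(size, "person household:",
          contact_hours_per_member(N_HOURS, size), "hours per other member")
```

With the total held fixed, per-person exposure rises as the household shrinks. The exact factor depends on how the time is assumed to be divided.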
I took household as my primary example as it is easiest to reason about, and I can accept your reasoning with the reservation that the actual factor needs to be measured not assumed - could it be 1.5 rather than 2? "household nearness factor" But a 4 person household in both countries will not be scaled the same, I still have a problem with that...
Same thing with the community factor
But two identical cities, both far from other cities, in countries where only the total population differs will be treated differently, and not only because of the larger number of people at a distance but because of a factor that matters a lot.
There must be a better way: make household sizes and country population matter less, while city size should maybe matter more (big cities rely on public transportation in a way small ones don't).
The population density should be accounted for in the normalization of the gravity kernel. If you examine the equation for community transmission (in the supplement of the paper), there's a distance dependence and a distance normalization. So this will account for density. If you have a numerical counterexample using these equations, we'd be happy to take a look.
So you are saying that two persons, at equal distance, in identical size/density cities..., with only difference Sweden (betac_scale=8.4) or Finland (betac_scale=16.8) should affect each other the same with your model?
And if Finland's betac_scale factor should not be scaled up to twice Sweden's, you need to explain why (its population is half of Sweden's).
How can we test this... Let's imagine a different Sweden with the northernmost and southernmost Swedish towns gone => betac up and population down. And then only look at Stockholm: what would you expect?
Do you agree with this test setup, am I missing something?
Reproducing query from earlier issue: "/* Modified factors to account for difference between population in Southeast Asia from Ferguson Model and Swedish population. This adjustment maintains 1/3 transmission in community, 1/3 transmission in household, and 1/3 transmission in workplace. Overall population is 8.4X smaller in Sweden, attributed as a 8.4X increase in community transmission factor. Household size is approximately 2X smaller in Sweden, 2 vs. 4, attributed as a 2X increase in household transmission factor. R0 Scaling factor used to set approximate doubling time. R0_scale=2.2 attributes to a doubling time of approximately 3 days while R0_scale=1.8 attributes to a doubling time of approximately 5 days. */"
Roger--I'm not sure this was properly answered earlier, and I'd like to make sure this is fully addressed. Am I correct in thinking that there may be unresolved questions in your mind?