SORMAS-Foundation / SORMAS-data-generator

GNU General Public License v3.0

[R] Control the number of cases generated #19

Open stephaneghozzi opened 3 years ago

stephaneghozzi commented 3 years ago

At the moment, the number of cases generated is exactly the number of cases officially reported. However, it would be useful to be able to adjust that number, especially for testing purposes.

One simple way of doing this is to multiply all case numbers by the same integer (in each dimension: county, age, sex, with or without symptoms, etc.).
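As a rough illustration of the integer-scaling idea, here is a minimal sketch in Python (not the generator's own R code). The function name `scale_counts`, the tuple layout of a stratum, and the example figures are all hypothetical and only stand in for whatever structure the generator actually uses.

```python
def scale_counts(counts, factor):
    """Multiply every per-stratum case count by the same positive integer.

    `counts` maps a stratum (here an assumed tuple of county, age group,
    sex, symptomatic flag) to a reported case count.
    """
    if isinstance(factor, bool) or not isinstance(factor, int) or factor < 1:
        raise ValueError("factor must be a positive integer")
    return {stratum: n * factor for stratum, n in counts.items()}


# Made-up example counts, not real reported data.
reported = {
    ("County A", "15-34", "female", True): 3,
    ("County A", "35-59", "male", False): 1,
    ("County B", "60-79", "female", True): 2,
}

scaled = scale_counts(reported, 3)
print(sum(reported.values()), sum(scaled.values()))  # 6 18
```

Because every stratum is multiplied by the same integer, every marginal total (per county, per age group, and so on) is multiplied by that integer as well, so totals computed along different dimensions stay consistent with each other.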

(At the moment, multiplying by a non-integer factor and then rounding would cause problems when matching the two data sources: RKI's corona dashboard and RKI's SurvStat.)
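The comment does not spell out the exact matching problem. One plausible way it could arise, shown here purely as an assumption, is that two sources break the same total down along different dimensions, so rounding each breakdown independently can make their totals disagree. The breakdowns and figures below are invented for the illustration.

```python
def scale_and_round(counts, factor):
    """Scale each count by `factor` and round to the nearest integer."""
    return {key: round(n * factor) for key, n in counts.items()}


# Two hypothetical breakdowns of the same 4 cases in one county.
by_age = {"0-59": 1, "60+": 3}       # e.g. as one source might report it
by_sex = {"female": 2, "male": 2}    # e.g. as the other source might report it

for factor in (2, 1.3):
    total_age = sum(scale_and_round(by_age, factor).values())
    total_sex = sum(scale_and_round(by_sex, factor).values())
    print(factor, total_age, total_sex, total_age == total_sex)
# 2 8 8 True
# 1.3 5 6 False
```

With an integer factor the two totals still agree; with the factor 1.3 they differ by one case after rounding.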

stephaneghozzi commented 3 years ago

Multiplying by an integer only allows the case number to be increased.

To decrease it, the clean solution would be to first generate the data, then (randomly) select the desired number of cases and remove all contacts and events (but see remark below) not linked to the retained cases. This means two different processes for increasing and decreasing the case number... To be discussed; this is not necessarily a problem.

N.B. At the moment, all contacts and events are linked to at least one case. However, SORMAS allows more flexibility: events can be recorded without a case having participated. That constraint could be removed from the data generator, but that should be a specific issue/requirement.
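The down-sampling step described in this comment could be sketched as follows, in Python for illustration rather than in the generator's R. The function `downsample`, the record fields `case_id` and `case_ids`, and the toy data are hypothetical. The sketch relies on the current constraint that every contact and event is linked to at least one case; trimming removed cases out of a retained event's participant list is an extra assumption, not something stated in the thread.

```python
import random


def downsample(cases, contacts, events, n_keep, seed=None):
    """Randomly keep `n_keep` cases and drop contacts/events not linked to them."""
    if not 0 <= n_keep <= len(cases):
        raise ValueError("n_keep must be between 0 and the number of cases")
    rng = random.Random(seed)  # seeded for reproducible test data
    kept = set(rng.sample(cases, n_keep))

    kept_cases = [c for c in cases if c in kept]
    # A contact is assumed to reference exactly one source case.
    kept_contacts = [c for c in contacts if c["case_id"] in kept]
    # An event is kept if at least one participating case is retained;
    # removed cases are trimmed from its participant list (assumption).
    kept_events = []
    for event in events:
        remaining = [cid for cid in event["case_ids"] if cid in kept]
        if remaining:
            kept_events.append({**event, "case_ids": remaining})
    return kept_cases, kept_contacts, kept_events


# Toy data with made-up identifiers.
cases = ["c1", "c2", "c3", "c4"]
contacts = [
    {"id": "k1", "case_id": "c1"},
    {"id": "k2", "case_id": "c3"},
    {"id": "k3", "case_id": "c4"},
]
events = [
    {"id": "e1", "case_ids": ["c1", "c2"]},
    {"id": "e2", "case_ids": ["c4"]},
]

kept_cases, kept_contacts, kept_events = downsample(
    cases, contacts, events, n_keep=2, seed=1
)
```

If the constraint mentioned in the N.B. were lifted, events with no participating case would need a separate rule, since the filter above would always drop them.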