synthetichealth / synthea

Synthetic Patient Population Simulator
https://synthetichealth.github.io/synthea
Apache License 2.0

Too many deaths? #771

Closed JePrzybilla closed 4 years ago

JePrzybilla commented 4 years ago

Hello, I am using Synthea (synthea-with-dependencies.jar, downloaded on June 3, 2020) on Ubuntu 20.04 LTS to generate different patient populations. I am surprised by the following. If I run the command

java -jar synthea-with-dependencies.jar -p 1000

I finally obtain {alive=1000, dead=225}, and the number of dead seems right to me. But if I run, for example,

java -jar synthea-with-dependencies.jar -a 0-140 -p 1000

I finally obtain {alive=945, dead=2183}. That seems to me to be too many dead people. The same occurs with other age ranges. What is going wrong?

Best regards.

jawalonoski commented 4 years ago

My guess is that a significant number of those people died from covid-19 and from just being crazy old. 140 is really, really old. So a lot of those patients are going to die before you successfully generate one who is still living.

Even though the default max age is 140, the ages are generated differently depending on whether or not you specify them.

If you specify the age using the -a option, the ages are selected with a uniform distribution. That produces a LOT of old people, who are going to die.

If you do not specify the age using the -a option, the ages are selected from the distributions in the demographics file, which skew more towards young and middle-aged people. There will be FEWER very old people, so fewer of them will die.
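
To make the effect concrete, here is a rough Python sketch. It is not Synthea's code: the Gompertz-style mortality curve, its parameters, and the triangular "demographics-like" age distribution are all assumptions picked only for illustration. It keeps generating people until the requested number reach their target age alive, and counts how many died along the way.

```python
import math
import random


def survives_to(age, rng):
    """Toy mortality model: True if a simulated person lives to `age`."""
    # Hypothetical Gompertz hazard parameters (illustration only, not from Synthea).
    a, b = 0.0001, 0.085
    # Survival probability S(age) = exp(-(a / b) * (exp(b * age) - 1)).
    p_survive = math.exp(-(a / b) * (math.exp(b * age) - 1.0))
    return rng.random() < p_survive


def generate(n_alive, sample_age, rng):
    """Generate people until n_alive reach their target age; return (alive, dead)."""
    alive = dead = 0
    while alive < n_alive:
        if survives_to(sample_age(rng), rng):
            alive += 1
        else:
            dead += 1
    return alive, dead


def uniform_age(rng):
    # Target ages spread evenly over 0-140, as with an explicit age range.
    return rng.uniform(0, 140)


def skewed_age(rng):
    # Assumed stand-in for a demographics file: mostly young and middle-aged.
    return rng.triangular(0, 100, 35)


rng = random.Random(0)
print("uniform 0-140:", generate(1000, uniform_age, rng))
print("skewed       :", generate(1000, skewed_age, rng))
```

With the uniform range, a large share of the target ages are beyond what the toy mortality curve lets anyone reach, so many more simulated people die before 1000 living ones are collected.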

JePrzybilla commented 4 years ago

Thank you for your fast response. I understand. It is not an error. Best regards.

npagare commented 4 years ago

Hi @jawalonoski, when not using -a, how do we make sure the generated population is above 18 years old? I am interested in creating a volume of acute care patients across the hospitals in a given city, with the generated population randomized between 75% and 100% of the bed capacity of each hospital.

I have a similar need to generate populations for:

  1. Children's hospitals (-a <18),
  2. LTAC (long-term acute care) hospitals, and
  3. Hospice facilities

Your help will be appreciated.

Thank you.