orchid-initiative / synthetic-database-project

MIT License

Massive Database Generation Study #77

Open NickKramer87 opened 10 months ago

As a data analyst, I would like access to a database of synthetic patients on the scale of the US population, from which I can draw whichever patients are applicable to my current study.

Proposed Subtasks:

  1. Determine the viability of creating a large database (350 million synthetic patients) so that generation need only be performed once.
  2. If that is viable, determine the best way to create the database (parallel CPUs, GPUs, outsourcing to a server, etc.).
  3. Generate the full database.
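
For subtask 1, a back-of-envelope estimate may be a useful starting point. The sketch below is only illustrative: the 350 million target comes from this issue, but the per-patient record size, the per-worker generation rate, and the worker count are placeholder assumptions that would need to be measured on the actual generator. The function name `estimate` is hypothetical.

```python
from dataclasses import dataclass

TARGET_PATIENTS = 350_000_000  # target size stated in this issue


@dataclass
class Estimate:
    storage_tb: float       # total storage in terabytes (10**12 bytes)
    wall_clock_days: float  # generation time in days


def estimate(patients: int,
             bytes_per_patient: float,
             patients_per_sec_per_worker: float,
             workers: int) -> Estimate:
    """Rough storage and wall-clock estimate, assuming linear scaling."""
    storage_tb = patients * bytes_per_patient / 1e12
    seconds = patients / (patients_per_sec_per_worker * workers)
    return Estimate(storage_tb, seconds / 86_400)


if __name__ == "__main__":
    # ASSUMED inputs, not measurements: 1 MB per patient record,
    # 10 patients/second per worker, 64 workers.
    est = estimate(TARGET_PATIENTS,
                   bytes_per_patient=1_000_000,
                   patients_per_sec_per_worker=10,
                   workers=64)
    print(f"storage: {est.storage_tb:.0f} TB, "
          f"time: {est.wall_clock_days:.1f} days")
```

Replacing the assumed inputs with numbers measured from a small trial run (say, a few thousand patients) would turn this into a first-pass viability figure for the report.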

Acceptance Criteria:

  1. A report stating whether a database of this size is feasible and, if so, the plan for creating it.
  2. The fully generated database.
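
If the parallel-CPU option from subtask 2 is pursued, one possible shape for the plan is to split the patient ID range into fixed-size shards and seed each shard deterministically, so that any shard can be regenerated on its own. This is a minimal sketch under stated assumptions: `shard_ranges` and `generate_shard` are hypothetical names, and the record produced here (an ID and a random age) is a stand-in for whatever the project's real patient generator emits.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def shard_ranges(total: int, shard_size: int) -> list[tuple[int, int]]:
    """Split [0, total) into half-open (start, end) ranges."""
    return [(start, min(start + shard_size, total))
            for start in range(0, total, shard_size)]


def generate_shard(bounds: tuple[int, int]) -> list[dict]:
    """Generate placeholder patients for one shard.

    The shard's start index seeds the RNG, so rerunning a shard
    reproduces the same records regardless of which worker runs it.
    """
    start, end = bounds
    rng = random.Random(start)
    # PLACEHOLDER record: a real generator would be called here instead.
    return [{"id": i, "age": rng.randint(0, 100)} for i in range(start, end)]


if __name__ == "__main__":
    # Tiny demo sizes; the real run would use the 350 million target
    # and write each shard to disk rather than hold it in memory.
    shards = shard_ranges(total=1_000, shard_size=250)
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(generate_shard, shards))
    print(sum(len(r) for r in results), "patients in", len(shards), "shards")
```

Independent shards also make it possible to stop and resume generation, or to move some shards to a remote server, without changing the output.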