orchid-initiative / synthetic-database-project

MIT License

Automatic Summary Generation #59

Open NickKramer87 opened 10 months ago

NickKramer87 commented 10 months ago

As a data analyst, I want the database generation tool to automatically generate summary statistics for the patient files it creates so that I can easily compare them to existing summary files of real data.

Proposed Subtasks:

  1. Generate a summary sheet in Excel that is similar, if not identical, to the summary sheets generated by HCAI.

Acceptance Criteria:

  1. The summary sheet is functionally identical to the real-world patient data summaries, so the same program that reads those summaries can read it.
  2. The summary file can be accessed by clicking on the link in the repository's README. See the attached screenshot.

rileeki commented 10 months ago

The summary sheets generated by HCAI can be found here: https://data.chhs.ca.gov/dataset/hospital-inpatient-characteristics-by-facility-pivot-profile

rileeki commented 9 months ago

I just discussed this task with @andrewkoji. He's interested in taking a look, starting by exploring in a Jupyter notebook rather than getting all set up with the developer guide instructions.

Start by playing with this CSV: https://github.com/orchid-initiative/synthetic-database-project/blob/main/csv_formatted_data_09-11-2023_134827.csv
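
As a possible starting point for the notebook exploration, here is a minimal sketch of a pivot-style summary: record counts per category within each facility. It uses only the Python standard library so it runs anywhere. The column names (`facility_name`, `sex`, `age`) and the sample rows are hypothetical stand-ins, since the real CSV's columns are not listed in this thread, and the long-format output is an assumption rather than the actual HCAI layout.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical sample standing in for the generated patient CSV.
# The real file's column names and values may differ.
SAMPLE_CSV = """facility_name,sex,age
General Hospital,F,34
General Hospital,M,71
General Hospital,F,58
Valley Clinic,M,45
"""


def summarize(rows, group_col, category_col):
    """Count records per category within each group (a pivot-style summary)."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[group_col]][row[category_col]] += 1
    return {group: dict(counter) for group, counter in counts.items()}


def write_summary(summary, out):
    """Write the summary as long-format CSV rows: group, category, count."""
    writer = csv.writer(out)
    writer.writerow(["group", "category", "count"])
    for group in sorted(summary):
        for category in sorted(summary[group]):
            writer.writerow([group, category, summary[group][category]])


# Parse the sample and build the per-facility counts.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = summarize(rows, "facility_name", "sex")

# Write the summary to an in-memory buffer; a real run would open a file.
buffer = io.StringIO()
write_summary(summary, buffer)
print(buffer.getvalue())
```

To try this on the real file, replace `io.StringIO(SAMPLE_CSV)` with an opened file handle and pass the actual column names. In a notebook, `pandas.read_csv` plus `DataFrame.to_excel` would likely be a more convenient route to an Excel sheet; the counting logic stays the same.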