orchid-initiative / synthetic-database-project

Additional Claim Types #61

Open NickKramer87 opened 10 months ago

NickKramer87 commented 10 months ago

As a data analyst, I want the synthetic database to include data from sources other than hospital discharge data so that I can write my analysis software to cover first responder data, emergency room data, ambulance data, etc.

Proposed Subtasks:

  1. Create complete list of additional claim types to be included.
  2. Add first responder data to the dataset.
  3. Add emergency department data to the dataset.

Acceptance Criteria:

  1. A complete list of desired claim types that will be added in steps 2-n.
  2. First responder data added to database.
  3. Emergency Department data added to database.

rileeki commented 10 months ago

@masonium and @NickKramer87 - can we get on a call sometime to discuss this user story? I'd like to make sure we all have the same understanding of the goal and acceptance criteria, so that Mason can scope out the level of effort required to complete it.

In my mind, this task is more about output options than it is about claim types. As of this message in Oct 2023, we are trying to output data in the HCAI inpatient data submission format (i.e., the format that California hospitals use to submit quarterly data to the state). For the LARC project, we'll need output in the HCAI research data format (i.e., the format of the data that eligible researchers can buy from California). The LARC project will only need the PDD (patient discharge data) file from the HCAI research data format. That's still just inpatient data, so we don't need to worry about other claim types yet. However, I think the changes needed to add the HCAI research data format as a second output option would likely follow the same concept we'd use to add other "claim types" in the future.

I think the bulk of the development task is to build a clear, modular foundation. Then, filling in the details of specific output options (like first responder and Emergency Department data) could be done by someone with deeper domain knowledge, but perhaps less coding expertise.
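
To make the "modular foundation" idea a bit more concrete, here is a minimal sketch, assuming Python and a simple registry of output options. Every name in it (`register_output`, `export`, the format keys, and especially the column names) is hypothetical; none of it is taken from the project's code or from the actual HCAI file layouts.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]
Formatter = Callable[[Iterable[Record]], List[Record]]

# Registry mapping an output-option name to the function that produces it.
OUTPUT_FORMATS: Dict[str, Formatter] = {}


def register_output(name: str) -> Callable[[Formatter], Formatter]:
    """Decorator that adds a formatter to the registry under `name`."""
    def decorator(func: Formatter) -> Formatter:
        if name in OUTPUT_FORMATS:
            raise ValueError(f"output format already registered: {name}")
        OUTPUT_FORMATS[name] = func
        return func
    return decorator


@register_output("hcai_inpatient_submission")
def to_submission(records: Iterable[Record]) -> List[Record]:
    # Placeholder columns only; NOT the real HCAI submission layout.
    return [
        {"patient_id": r["patient_id"], "discharge_date": r["discharge_date"]}
        for r in records
    ]


@register_output("hcai_research_pdd")
def to_research_pdd(records: Iterable[Record]) -> List[Record]:
    # Placeholder column only; NOT the real HCAI research PDD layout.
    return [{"discharge_year": str(r["discharge_date"])[:4]} for r in records]


def export(records: Iterable[Record], output_format: str) -> List[Record]:
    """Look up the requested output option and apply it to the records."""
    try:
        formatter = OUTPUT_FORMATS[output_format]
    except KeyError:
        raise ValueError(f"unknown output format: {output_format}") from None
    return formatter(records)
```

With something like this in place, adding first responder or Emergency Department output would mean writing one more small registered function, without touching the generator or `export` itself.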

Does that make any sense?