Open Dior13 opened 1 year ago
My local runs are now working and creating the formatted data output, yay! I checked in the fixes for the various errors it hit. Once other people try running this locally, I can be available to help them. We will probably discover other bugs on non-identical setups, and we can review the READMEs along the way.
I am working on fields now. The first thing I saw was that the Race and Ethnicity mappings were swapped and white was missing from the dictionary, so I'm testing that quick fix now and will check it in before trying the first "ready to code" element.
For the Principal Diagnosis I found a SNOMED CT to ICD-10-CM map available from the NIH: https://www.nlm.nih.gov/research/umls/mapping_projects/snomedct_to_icd10cm.html I had to submit a login request to access the map, which I have now done.
UPDATE: We found that the mapping is not 1-1 or simple. Further, Synthea does not plan to offer ICD-10-CM output (https://github.com/synthetichealth/synthea/issues/403)
Thoughts on the homeless indicator field:
Synthea has a homelessness module.
Based on the documentation, I think periods of homelessness should be identifiable in the conditions.csv output file as SNOMED-CT code 32911000 with associated Start and Stop dates.
We might look to see if the hospital admission date falls within any period of homelessness in that patient’s condition data… or that might be too cumbersome. Thoughts? @TravisHaussler
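A rough sketch of that date-window check, in case it helps judge how cumbersome it would be. It uses only the standard library and assumes conditions.csv has START, STOP, PATIENT, and CODE columns with YYYY-MM-DD dates, and that an empty STOP means the period is still ongoing. The sample rows and the function names are made up for illustration.

```python
import csv
import io
from datetime import date

HOMELESS_CODE = "32911000"  # SNOMED-CT code for homelessness mentioned above

def homeless_periods(conditions_file):
    """Return {patient_id: [(start, stop_or_None), ...]} for homelessness rows."""
    periods = {}
    for row in csv.DictReader(conditions_file):
        if row["CODE"] != HOMELESS_CODE:
            continue
        start = date.fromisoformat(row["START"])
        # Assumption: an empty STOP means the condition has not ended yet.
        stop = date.fromisoformat(row["STOP"]) if row["STOP"] else None
        periods.setdefault(row["PATIENT"], []).append((start, stop))
    return periods

def homeless_at(periods, patient, admit):
    """True if the admission date falls inside any homelessness period."""
    return any(
        start <= admit and (stop is None or admit <= stop)
        for start, stop in periods.get(patient, [])
    )

# Made-up sample standing in for output/csv/conditions.csv
sample = io.StringIO(
    "START,STOP,PATIENT,ENCOUNTER,CODE,DESCRIPTION\n"
    "2019-01-01,2019-06-30,p1,e1,32911000,Homeless\n"
    "2020-02-01,,p2,e2,32911000,Homeless\n"
    "2019-03-01,2019-03-15,p1,e3,444814009,Viral sinusitis\n"
)
periods = homeless_periods(sample)
print(homeless_at(periods, "p1", date(2019, 3, 10)))
print(homeless_at(periods, "p1", date(2019, 7, 1)))
```

If the encounter START values turn out to be full timestamps, taking the first 10 characters before calling `date.fromisoformat` should be enough for this comparison.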
I will look into your question about homelessness.
Elsewhere: while looking for the Present on Admission coding (which I could not find in any of the output/csv/ files), I found a new set of data that Synthea can produce using a setting in synthea_settings:
https://github.com/synthetichealth/synthea/wiki/CPCDS-Export
We change exporter.cpcds.export = true, and now I should be able to merge encounter IDs with CPCDS claim IDs to access the PoA flags. I will try today or tomorrow to do the pandas work for that.
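The pandas work could look roughly like the sketch below. The column names `claim_id` and `poa_flag` are placeholders, not the real CPCDS export headers, and it assumes the claim ID matches the encounter `Id` directly. Both need checking against the actual files.

```python
import pandas as pd

# Made-up stand-ins for encounters.csv and the CPCDS claims export.
encounters = pd.DataFrame(
    {"Id": ["e1", "e2", "e3"], "PATIENT": ["p1", "p1", "p2"]}
)
claims = pd.DataFrame(
    {"claim_id": ["e1", "e3"], "poa_flag": ["Y", "N"]}  # hypothetical column names
)

# Left join keeps every encounter; encounters with no matching claim get NaN.
merged = encounters.merge(
    claims, how="left", left_on="Id", right_on="claim_id"
)
print(merged[["Id", "PATIENT", "poa_flag"]])
```

A left join seems safer than an inner join here, since it makes encounters without a PoA flag visible instead of silently dropping them.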
In order to make progress getting something usable for a metrics dashboard we want to do 2 things as I understand it:
For 1 I attempted to summarize the remaining fields in a new tab on the output specs sheet so we can discuss how each one can be pursued. I added a column for proposal for you @rileeki to check over.
In particular I think some questions are:
@TravisHaussler
I reviewed your proposals in the new tab on the output specs sheet. For the most part, I agree with all of your proposals. But, I think we can get by skipping most of them if they'll be cumbersome to code. We can get some practical proof-of-concept examples without some of these fields. And, as we discussed, when we want to do something that requires the fully fleshed out data, it might be more straightforward to code a new output option in Java and have Synthea spit it out directly.
Hopefully that won't be more than a few hours of work. In the meantime, I'll work on getting details together on the ER output needs.
Remaining missing fields are:
https://docs.google.com/spreadsheets/d/1uWe-IaOa1SV7UIm4N8fs-ym2XtGVVCDMKyDE3fL8oQ8/edit#gid=1477458142
reference https://hcai.ca.gov/wp-content/uploads/2022/12/IP-format-and-file-specs-jan-2023.pdf