MIT-LCP / mimic-code

MIMIC Code Repository: Code shared by the research community for the MIMIC family of databases
https://mimic.mit.edu
MIT License

Update row counts in validate scripts for mimic-iv-ed v2.2 #1476

Closed: ZhipengHe closed this 1 year ago

ZhipengHe commented 1 year ago

Hi there, this pull request resolves issue #1475.

MIMIC-IV-ED v2.2 removed a subset of subject_id values, which will be retained internally as a test set. The documentation states: "Final row counts are available in the validation scripts published with the MIMIC Code Repository." However, the row counts in those scripts have not been updated for MIMIC-IV-ED v2.2 (issue #1475).

I updated the row counts in validate.sql for both MySQL and Postgres, and saved the original validation scripts as validate_old.sql.

Validation output with the new scripts:

    tbl    | expected_count | observed_count | row_count_check
-----------+----------------+----------------+-----------------
 diagnosis |         899050 |         899050 | PASSED
 edstays   |         425087 |         425087 | PASSED
 medrecon  |        2987342 |        2987342 | PASSED
 pyxis     |        1586053 |        1586053 | PASSED
 triage    |         425087 |         425087 | PASSED
 vitalsign |        1564610 |        1564610 | PASSED
(6 rows)
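
For readers unfamiliar with this kind of check, the sketch below shows the general idea of a row-count validation: compare a hard-coded expected count per table against `COUNT(*)` on the loaded table. It is not the repository's validate.sql. It uses Python's built-in sqlite3 with an in-memory database, and the helper name `validate_row_counts`, the toy tables, and the toy counts are all hypothetical, chosen only so the example runs on its own.

```python
import sqlite3


def validate_row_counts(conn, expected):
    """Return (table, expected, observed, status) for each table in `expected`."""
    results = []
    for tbl, expected_count in expected.items():
        # Table names come from a trusted, hard-coded dict, not from user input.
        observed = conn.execute(f'SELECT COUNT(*) FROM "{tbl}"').fetchone()[0]
        status = "PASSED" if observed == expected_count else "FAILED"
        results.append((tbl, expected_count, observed, status))
    return results


# Toy in-memory database (hypothetical data, not MIMIC-IV-ED).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edstays (stay_id INTEGER)")
conn.execute("CREATE TABLE triage (stay_id INTEGER)")
conn.executemany("INSERT INTO edstays VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO triage VALUES (?)", [(1,), (2,)])

# Toy expected counts; the triage count is deliberately stale to show a failure.
toy_expected = {"edstays": 3, "triage": 3}
report = validate_row_counts(conn, toy_expected)
for row in report:
    print(row)
```

A stale expected count shows up as a FAILED row even when the data loaded correctly, which is the situation issue #1475 describes for the v2.2 scripts.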

alistairewj commented 1 year ago

I removed the "old" scripts, since the git history already keeps track of the validation scripts for older versions. I also tidied up one comment. Thanks!