MIT-LCP / mimic-code

MIMIC Code Repository: Code shared by the research community for the MIMIC family of databases
https://mimic.mit.edu
MIT License

Huge inconsistencies in patientweight in inputevents #1539

Open mpierrau opened 1 year ago

mpierrau commented 1 year ago

Discussed in https://github.com/MIT-LCP/mimic-code/discussions/1535

Originally posted by **mpierrau** May 2, 2023

Hi! I have been working on MIMIC-IV for some time and recently discovered that the column `patientweight` is extremely inconsistent for some `stay_id` values, and I cannot find a reasonable explanation for it. There are multiple rows with `starttime` values within minutes of each other but with different `patientweight`. I understand this can happen, but some of the differences are very large, on the order of tens of kilograms.

I've been parsing MIMIC-IV using Python, so I don't have reproducible SQL code for you, but you can look at, for example, these `stay_id` values:

- 39615949 (diff 170 kg between starttime 2145-10-06 20:52:00 and starttime 2145-10-06 22:31:00)
- 35032951 (diff 23 kg)
- 38680231 (diff 25 kg)
- 33230840 (diff 107 kg)
- ...

Can you explain why this happens? Is daily weight (itemid 224639 in chartevents) or admission weight (itemid 226512 in chartevents) a better option?

Kindly, Magnus

alistairewj commented 1 year ago

Not sure without digging into more detail. My first check is usually: how often does this happen? If it's very infrequent (<1%), then it very well could be a data entry error. These weights are manually entered.

The patient weight column of that table is used for weight dependent dosing, so in order for the doses to be displayed correctly, the weights need to be correct. I'd expect the data to be fairly reasonable because of that, but who knows until you look!
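
The frequency check suggested above can be sketched in Python with pandas. This is a minimal sketch, not an official MIMIC query: it assumes the `inputevents` table has been loaded into a DataFrame with `stay_id` and `patientweight` columns, the 20 kg threshold is an arbitrary illustrative cutoff, and the toy rows below are made up rather than taken from MIMIC-IV.

```python
import pandas as pd


def weight_spread_per_stay(inputevents: pd.DataFrame) -> pd.DataFrame:
    """Summarise the spread of patientweight within each stay_id."""
    spread = (
        inputevents.dropna(subset=["patientweight"])
        .groupby("stay_id")["patientweight"]
        .agg(min_kg="min", max_kg="max", n_distinct="nunique")
    )
    spread["range_kg"] = spread["max_kg"] - spread["min_kg"]
    return spread


def fraction_inconsistent(spread: pd.DataFrame, threshold_kg: float = 20.0) -> float:
    """Fraction of stays whose weight range exceeds threshold_kg (assumed cutoff)."""
    return float((spread["range_kg"] > threshold_kg).mean())


# Toy rows standing in for inputevents; stay_ids and weights are hypothetical.
toy = pd.DataFrame(
    {
        "stay_id": [1, 1, 2, 2, 3, 3],
        "patientweight": [80.0, 80.5, 70.0, 240.0, 65.0, 65.0],
    }
)

spread = weight_spread_per_stay(toy)
frac = fraction_inconsistent(spread, threshold_kg=20.0)
print(spread.sort_values("range_kg", ascending=False))
print(f"Share of stays with a weight range above 20 kg: {frac:.1%}")
```

On the real table, the same two functions could be applied to a DataFrame read from a local export of `inputevents` (the file location depends on your setup). If the resulting share is well under 1%, data entry error is a plausible explanation; a larger share would call for a closer look.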

vikash06131721 commented 1 year ago

Hi, a question regarding the time: why is it "2180-05-07 00:00:00", that is, in the year 2180? Is there any way to decode it, or has it been done on purpose so as not to reveal details?

alistairewj commented 1 year ago

> Hi, a question regarding the time: why is it "2180-05-07 00:00:00", that is, in the year 2180? Is there any way to decode it, or has it been done on purpose so as not to reveal details?

Not sure if you meant to comment on this issue, but to answer your question: the years are deidentified by randomly shifting them into the future. You are not meant to decode them to real values, though we do provide a 3-year grouping to give some idea of the time course.
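
To illustrate why shifted dates remain usable for analysis, here is a toy sketch of date shifting. It assumes, as I understand the MIMIC documentation to describe, that a single offset is applied to all of one patient's timestamps, so the intervals between that patient's events are unchanged. The "real" dates and the offset below are invented for illustration; the actual offsets are not published.

```python
from datetime import datetime, timedelta


def shift_timestamps(timestamps, offset):
    """Apply one fixed offset to every timestamp of a single patient."""
    return [t + offset for t in timestamps]


# Hypothetical real-world event times for one patient (made up).
real = [datetime(2015, 3, 1, 20, 52), datetime(2015, 3, 1, 22, 31)]

# Hypothetical per-patient offset of roughly 130 years (made up).
offset = timedelta(days=365 * 130 + 17)

shifted = shift_timestamps(real, offset)

# The calendar dates change, but the gap between events does not.
real_gap = real[1] - real[0]
shifted_gap = shifted[1] - shifted[0]
print(shifted[0], shifted[1], shifted_gap)
```

Because the gap between events survives the shift, durations and orderings within a stay can still be computed directly from the shifted timestamps.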