Closed mworni closed 12 years ago
in these cases i would recode, since it is plausible that a decimal point was missing (e.g. 214 entered for 21.4). now, the reproducible research standard would say that your recoding should happen within the script, since then everything is documented, but we are still not at the "religious" stage, so you can always recode outside the script if you find it easier. i would definitely recommend some graphics to detect outliers like that. as a side comment, Joao and I are developing a number of standard procedures for really dirty data.
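A minimal sketch of the "missing decimal point" recode, in case it helps. The example values are made up, the cutoff of 100 is borrowed from the recode discussed in this thread, and dividing by 10 is only an assumption about how the miscoding happened; `BMI_RECIP_fixed` and `suspect` are hypothetical names. Whether a corrected value is clinically plausible still needs to be checked by hand.

```r
# Hypothetical example values; the real BMI_RECIP comes from the study data.
BMI_RECIP <- c(21.4, 27.9, 214, 140, NA, 33.2)

# Flag suspicious values (assumed cutoff of 100), ignoring missing ones.
suspect <- !is.na(BMI_RECIP) & BMI_RECIP > 100

# Keep the original variable untouched and write the fix to a new one.
BMI_RECIP_fixed <- BMI_RECIP
BMI_RECIP_fixed[suspect] <- BMI_RECIP[suspect] / 10  # assumes one missing decimal place

# Side-by-side view of what changed, so the recode is documented in the script.
print(data.frame(original = BMI_RECIP, fixed = BMI_RECIP_fixed, changed = suspect))
```

Keeping this inside the script means every changed value stays visible next to its original.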
On Tue, Jun 12, 2012 at 2:21 PM, mworni <reply@reply.github.com> wrote:
Tony, you might have to check some BMI values and decide what to do with them. There are BMIs above 140, and even a value of 214. What do you want to do with them: set them as missing? I can hardly imagine that patients with BMIs over 50 get a lung transplant...
Reply to this email directly or view it on GitHub: https://github.com/rpietro/airwayDehiscence/issues/6
Should probably set all the impossible values as missing. Less than 10 and over 50 is impossible, I would think.
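The range rule above (below 10 or above 50 treated as impossible) could be sketched roughly like this. The example values are invented, and `implausible` and `BMI_RECIP_clean` are just illustrative names; the cutoffs are the ones suggested in this comment, not a validated clinical rule.

```r
# Hypothetical example values; the real BMI_RECIP comes from the study data.
BMI_RECIP <- c(8.5, 21.4, 27.9, 52, 214, NA)

# TRUE where a non-missing value falls outside the suggested 10 to 50 range.
implausible <- !is.na(BMI_RECIP) & (BMI_RECIP < 10 | BMI_RECIP > 50)

# New variable with implausible values set to missing; the original is kept.
BMI_RECIP_clean <- replace(BMI_RECIP, implausible, NA)

# Number of values that were set to missing by the rule.
print(sum(implausible))
```

Counting the flagged values before dropping them makes it easy to report how many records the rule affected.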
i think you guys already know this, but to recode you would write something like:
library(ggplot2) # needed for qplot
qplot(BMI_RECIP) # will show you that this variable is completely skewed due to the miscoding
BMI_RECIP_clean <- BMI_RECIP # rather than recoding the original variable, i create a new variable called BMI_RECIP_clean, since we might need the old variable later on, e.g. for trying different recoding strategies
BMI_RECIP_clean[BMI_RECIP_clean > 100] <- NA # the square brackets subset the variable: take just the values greater than 100 and set them to NA (which means missing for numeric variables)
qplot(BMI_RECIP_clean) # this will show you that the recoding worked
i inserted the code above in the script