Closed jeyabbalas closed 2 years ago
Let's assume that you first impute a large `hgt` based on the predictors. Since `hgt` and `wgt` are strongly correlated, a large `wgt` is likely to be imputed. Then, `bmi` is deductively calculated, and the cycle reiterates. A large BMI will lead to a specific `hgt`/`wgt` ratio, and so on. Because there are too many degrees of freedom, the system can in theory go into outer space for one or more chains.
Thank you for your prompt reply @gerkovink !
How is it even possible to get a BMI > 300 using PMM? PMM imputes real values from the data for `hgt` and `wgt` in the above scenario. In the `boys` dataset, the max values are `hgt = 198` and `wgt = 117.4`. So, in theory, the largest possible value for BMI can be: `bmi = (wgt / (hgt / 100)^2) = (117.4 / (198 / 100)^2) = 29.95`. How do we get such large values imputed for `bmi`?
Passive imputation does calculate the imputed value for BMI in your example deductively, not by PMM.
Right, but `wgt` and `hgt` are still being imputed using PMM, right? In my previous reply, I am computing BMI deductively using the formula for BMI.
But `hgt` and `wgt` can be imputed such that the highest `wgt` and lowest `hgt` lead to an unrealistically large BMI, simply because their imputations are influenced by the unrestricted BMI from the previous iteration.
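The point about mismatched height and weight values can be illustrated with a small sketch. It assumes, for illustration only, four made-up (height, weight) records; only the extremes 50, 198 and 117.4 are values quoted in this thread, and `bmi` is a hypothetical helper name. It does not run PMM or mice; it only compares the BMI range when both values come from the same record with the range when each value is an observed one but the two come from different records.

```python
from itertools import product


def bmi(wgt_kg, hgt_cm):
    """BMI formula used in the thread: wgt / (hgt / 100)^2."""
    return wgt_kg / (hgt_cm / 100) ** 2


# Hypothetical (hgt in cm, wgt in kg) records; NOT rows of the boys data.
observed = [(50.0, 3.5), (110.0, 19.0), (150.0, 42.0), (198.0, 117.4)]

# BMI when height and weight come from the same observed record.
within = [bmi(w, h) for h, w in observed]

# BMI when each value is individually an observed one, but the two
# values may come from different records (different donors).
heights = [h for h, _ in observed]
weights = [w for _, w in observed]
crossed = [bmi(w, h) for h, w in product(heights, weights)]

print(f"same record:   {min(within):.2f} to {max(within):.2f}")
print(f"mixed records: {min(crossed):.2f} to {max(crossed):.2f}")
```

In this toy setup every individual value is realistic, yet the mixed combinations include the heaviest weight paired with the shortest height, which is the kind of pairing described above.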
Oh I see what you mean! So, due to the large degrees of freedom of `bmi`, a large `bmi` sends `hgt` in the other direction. The min value of `hgt` is 50. In that case, `bmi = (wgt / (hgt / 100)^2) = (117.4 / (50 / 100)^2) = 469.6`. This is consistent with the numbers I am seeing.
This makes sense. Thank you so much!
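The two figures in this exchange can be checked with a few lines of arithmetic. This is only a sketch: `bmi` is a hypothetical helper, and the inputs are the extremes used in the calculations above (maximum `wgt` 117.4, maximum `hgt` 198, minimum `hgt` 50).

```python
def bmi(wgt_kg, hgt_cm):
    """BMI with weight in kg and height in cm: wgt / (hgt / 100)^2."""
    return wgt_kg / (hgt_cm / 100) ** 2


max_wgt, max_hgt, min_hgt = 117.4, 198.0, 50.0

# Heaviest weight paired with the tallest height.
tall = bmi(max_wgt, max_hgt)

# Heaviest weight paired with the shortest height.
short = bmi(max_wgt, min_hgt)

print(round(tall, 2), round(short, 2))  # prints 29.95 469.6
```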
Thank you to this community for building such a comprehensive and useful tool!
I went through the MICE vignettes and have a question about one of them. In vignette 4, "Passive imputation and post-processing", please scroll down to step 8 and pay attention to the figure in that step. The vignette warns against circumstances that can lead to such strange imputations (BMIs well over 100). It describes this as "the problem of circularity", which I have summarized below.
The example introduces a passive imputation rule that computes BMI deductively from weight and height: `bmi = wgt / (hgt / 100)^2`.
So, the variable `bmi` now depends upon variables `wgt` and `hgt`. However, from the prediction matrix, we see that variables `wgt` and `hgt` still cyclically depend upon `bmi`. This is described as the problem of circularity.
The suggested fix is simple: break the cycle. Make sure that `wgt` and `hgt` no longer depend upon `bmi`.
I am unable to understand why the problem of circularity leads to such strange imputations and non-convergence. When I use the default PMM imputation method, I always impute realistic values for `hgt` and `wgt`, sampled directly from similar examples in the data. How then can I get such bizarre BMIs, like in the 300s? Why does this problem not manifest when we don't use passive imputation but have cyclic dependencies?