jamesjiadazhan / dietaryindex

R Package for Calculating Healthy Eating Index-2020 (HEI2020), Alternative Healthy Eating Index (AHEI), Dietary Approaches to Stop Hypertension (DASH), Mediterranean Diet (MED), Dietary Inflammation Index (DII), and Planetary Health Diet Index from the EAT-Lancet Commission (PHDI) for the NHANES, ASA24, DHQ, and other dietary assessments
https://jamesjiadazhan.github.io/dietaryindex_manual/
MIT License

Possible issue calculating HEI2015 from ASA24 dataset with missing data? #132

Closed alaila1 closed 1 year ago

alaila1 commented 1 year ago

I have a dataset that has missing data. I am trying to calculate HEI based on ASA24 data, but I keep getting this error:

> Error in if (COHORT$TOTALKCAL[i] == 0) { :
> missing value where TRUE/FALSE needed

Looking at the ASA24_exp data, there's no missing data there while my dataset does have missing data for some participants. I am wondering if there's an issue with missing data.
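For reference, R raises this message whenever an `if` condition evaluates to NA. The sketch below reproduces it in isolation. The one-row `COHORT` data frame is a hypothetical stand-in, not the package's internal object, and the comparison only mirrors the line quoted in the error above.

```r
# Hypothetical stand-in: one participant whose energy intake is missing
COHORT <- data.frame(TOTALKCAL = NA_real_)

# NA == 0 evaluates to NA, and if() cannot branch on NA,
# so the comparison below raises an error instead of returning a label
result <- tryCatch(
  {
    if (COHORT$TOTALKCAL[1] == 0) "zero" else "nonzero"
  },
  error = function(e) conditionMessage(e)
)

# The captured text is the message R attached to the error
print(result)
```

If this is indeed what happens inside the function, any row with a missing energy value would be enough to stop the whole calculation.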

The following is the code I use:

#Define ASA24 data from a bigger dataset

Totals <- data.frame(UserID,
                        KCAL, 
                        F_TOTAL,
                        F_CITMLB,
                        F_OTHER,
                        V_TOTAL,
                        V_DRKGR,
                        V_LEGUMES,
                        PF_MPS_TOTAL,
                        PF_EGGS,
                        PF_SOY,
                        PF_NUTSDS,
                        PF_LEGUMES,
                        PF_SEAFD_HI,
                        PF_SEAFD_LOW,
                        G_WHOLE,
                        D_TOTAL,
                        TFAT,
                        SFAT,
                        G_REFINED,
                        SODI,
                        ADD_SUGARS)

#Calculate HEI2015 from ASA24
data_HEI <- HEI2015_ASA24(Totals)

#The error happens when I run the HEI2015_ASA24() function

System info: OS: Windows 10, x64 R version 4.0.3

jamesjiadazhan commented 1 year ago

Thanks for reaching out!

The issue is exactly what you described: the HEI2015_ASA24 function cannot handle missing data. I don't think there is a way to improve this, because you need all of the dietary variables to calculate the HEI2015, and it would not be wise to impute the missing values. You therefore have to clean your data before using the HEI2015_ASA24 function. You can use the na.omit() function (https://www.rdocumentation.org/packages/photobiology/versions/0.10.6/topics/na.omit) to remove all rows with any NA values. For example:

clean_Totals <- na.omit(Totals)

#Calculate HEI2015 from ASA24

data_HEI <- HEI2015_ASA24(clean_Totals)
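Before dropping rows, it may help to see how many participants would be lost and which variables are responsible. The following is a minimal sketch using base R only. The small `Totals` data frame here is made up for illustration and contains just three of the ASA24 columns; the real data would have all of the variables listed above.

```r
# Hypothetical stand-in for Totals; values are invented for illustration
Totals <- data.frame(
  UserID  = c("A", "B", "C"),
  KCAL    = c(2000, NA, 1800),
  F_TOTAL = c(1.5, 2.0, NA)
)

# Number of missing values in each column
na_per_column <- colSums(is.na(Totals))
print(na_per_column)

# Rows with at least one NA, i.e. the rows that na.omit() will remove
incomplete <- Totals[!complete.cases(Totals), ]
print(incomplete$UserID)

# Keep only complete rows, as in the suggestion above
clean_Totals <- na.omit(Totals)
print(nrow(clean_Totals))
```

Note that na.omit() looks at every column in the data frame, so it is worth restricting `Totals` to the variables the function needs (as in the original code) before calling it; otherwise an NA in an unrelated column would also remove the participant.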

This should work perfectly. Let me know if it doesn't.