johnfergusonNUIG / graphPAF

Cannot calculate confidence interval using PAF_calc_discrete #5

Open hmeeks0212 opened 1 month ago

hmeeks0212 commented 1 month ago

I'm trying to use PAF_calc_discrete to estimate the proportion of cancer patients who would survive 5 years after first diagnosis if no one had diabetes.

Here is the code that I used:

paf_1_q1 <- graphPAF::PAF_calc_discrete(
  model = t1,
  data = clean_analysis %>% dplyr::filter(censor2021.lt18 == 0),
  riskfactor = "t2dm",
  refval = "0",
  calculation_method = "D",
  ci = TRUE,
  boot_rep = 50,
  ci_type = c("norm"),
  t_vector = c(5),
  verbose = TRUE
)

I keep receiving the following error:

Error in if (const(t, min(1e-08, mean(t, na.rm = TRUE)/1e+06))) { : missing value where TRUE/FALSE needed

However, if I don't ask for a CI, the following code works:

paf_1_q1 <- graphPAF::PAF_calc_discrete(
  model = t1,
  data = clean_analysis %>% dplyr::filter(censor2021.lt18 == 0),
  riskfactor = "t2dm",
  refval = "0",
  calculation_method = "D",
  t_vector = c(5)
)

Can you please let me know what went wrong?

Thank you.

johnfergusonNUIG commented 1 month ago

It looks like this error is generated by the boot package, which is used to calculate the CIs. See:

https://stackoverflow.com/questions/19929488/error-in-bootstrapping-error-in-if-constt-min1e-08-meant-na-rm-true-1

for a similar error. Are there any NAs in the relevant variables in the dataset being used?
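A quick way to check for NAs is sketched below on a hypothetical toy data frame. The column names `fuptime`, `event` and `t2dm` are stand-ins, not the variables from the real dataset; the same two calls would be applied to whichever columns the model actually uses.

```r
# Hypothetical toy data standing in for the real analysis dataset.
toy <- data.frame(
  fuptime = c(5.1, 2.3, NA, 7.8),
  event   = c(1, 0, 1, 0),
  t2dm    = factor(c("0", "1", "0", NA))
)

# Variables that enter the model (assumed names, for illustration only).
vars <- c("fuptime", "event", "t2dm")

# Number of missing values in each relevant column.
na_counts <- colSums(is.na(toy[, vars]))
print(na_counts)

# Number of rows with no missing value in any relevant column.
n_complete <- sum(complete.cases(toy[, vars]))
print(n_complete)
```

If every count is zero and the error persists, the missing value is probably being produced inside the bootstrap replicates rather than coming from the raw data.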

hmeeks0212 commented 1 month ago

Hi John,

Thank you so much for the prompt response.

No, there aren't any NAs in the relevant variables in the dataset being used. I noticed that if the exposure is a factor variable, CIs can't be estimated, but if I change the exposure to a character variable, CIs can be estimated.
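For anyone hitting the same error, the conversion can be sketched as follows. This only illustrates the workaround reported here on a hypothetical toy data frame; it does not explain why the factor version fails, and the model would need to be refitted on the converted data before calling PAF_calc_discrete again.

```r
# Hypothetical toy data: exposure stored as a factor.
toy <- data.frame(t2dm = factor(c("0", "1", "0", "1")))

# as.character() on a factor returns the level labels ("0"/"1"),
# not the underlying integer codes.
toy$t2dm <- as.character(toy$t2dm)

str(toy)
```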

Another thing I noticed is that the estimated PAF differs between PAF_calc_discrete and AF::AFcoxph. For example, one of the exposures of interest is education (less than HS, HS graduates, and more than HS). If I use PAF_calc_discrete with refval = "Less than HS", I get PAF = 0.52. But if I use AF::AFcoxph (for which I have to convert education into a binary variable: 1 = Less than HS, 0 = otherwise), I get AF = 0.023. Here is the code that I used:

graphpaf_paf <- graphPAF::PAF_calc_discrete(
  model = coxph(
    Surv(fuptime2021, censor2021.lt18) ~ male + byr + white + hispanic + education,
    data = t
  ),
  riskfactor = "education",
  refval = "Less than HS",
  calculation_method = "D",
  t_vector = c(1, 18),
  ci = TRUE,
  boot_rep = 50,
  ci_type = c("norm"),
  verbose = FALSE
)

af_afcoxph <- AF::AFcoxph(
  coxph(
    Surv(fuptime2021, censor2021.lt18) ~ male + byr + white + hispanic + education.2lev,
    data = t,
    ties = "breslow"
  ),
  data = t,
  exposure = "education.2lev",
  times = c(1, 18)
)

Thank you so much for your help.

johnfergusonNUIG commented 1 month ago

Hi there,

That's interesting - thanks for letting me know.

Regarding the discrepancy between AF and PAF_calc_discrete, there may be 2 problems. First, refval in PAF_calc_discrete should be the value of the exposure that minimises risk (probably 'more than HS' here). Second, even if refval = 'more than HS', there will still be a difference between PAF_calc_discrete and AF for a multilevel exposure, since graphPAF accounts for the differences in risk between 'HS graduates' vs 'more than HS' and 'Less than HS' vs 'more than HS', whereas these differences are pooled if the comparison is 'Less than HS' vs everyone else.
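A small numerical sketch may make the second point concrete. The prevalences and risks below are made up purely for illustration, and the calculation is a crude, unadjusted PAF on a single risk scale. It is not the model-based, time-dependent calculation that graphPAF or AF performs; it only shows that the two comparisons answer different questions.

```r
# Hypothetical prevalence of each education level and risk of the outcome within it.
prev <- c(less_HS = 0.2, HS_grad = 0.3, more_HS = 0.5)
risk <- c(less_HS = 0.30, HS_grad = 0.20, more_HS = 0.10)

# Observed overall risk in the population.
overall <- sum(prev * risk)

# Multilevel comparison: everyone is moved to the 'more_HS' risk level.
paf_multi <- (overall - risk[["more_HS"]]) / overall

# Binary comparison: only 'less_HS' is moved, and it is moved to the
# pooled risk of everyone else, so the population risk after the
# intervention equals that pooled risk.
others <- c("HS_grad", "more_HS")
risk_pooled <- sum(prev[others] * risk[others]) / sum(prev[others])
paf_binary <- (overall - risk_pooled) / overall

print(round(c(multilevel = paf_multi, binary = paf_binary), 3))
```

With these made-up numbers the multilevel PAF is about 0.41 and the binary one about 0.19, so a gap between the two packages is expected even when the reference level is chosen consistently.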