isubirana / compareGroups

R package to easily build publication-ready univariate or bivariate descriptive tables from a data set.
https://isubirana.github.io/compareGroups/

The p for trend #32

Open 2018zhangrrrr opened 1 year ago

2018zhangrrrr commented 1 year ago

Your R package is extremely efficient in handling between-group comparisons, and I truly appreciate your work. However, I recently encountered a minor issue that I'm not sure how to resolve. When I tried to calculate the p for trend in a binary logistic regression, I added show.p.trend = TRUE in the descrTable function. However, the result is not consistent with my manual calculation, and the p for trend value given by the R package is the same as the p.overall value. Could you please guide me on how to resolve this issue? For my manual calculation, I simply converted the factor variable to a numeric variable and built a univariate logistic regression model without adjusting for other variables.

isubirana commented 3 weeks ago

Thanks for your kind words.

When the p-value for trend is requested (show.p.trend=TRUE), compareGroups computes it the same way SPSS computes the linear-by-linear association test for two categorical variables:

1-pchisq(cor(as.integer(x),as.integer(y))^2*(length(x)-1),1)

where x is a categorical variable and y is the variable indicating the groups; as.integer() scores each of them by its level codes (1, 2, 3, ...).
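As a minimal sketch, the formula above can be wrapped in a small base-R helper. The name `p_trend_lbl` is hypothetical and not part of compareGroups, and dropping incomplete pairs before computing the correlation is an assumption made here, not something stated about the package. The toy data are made up purely for illustration.

```r
# Hypothetical helper (not part of compareGroups): linear-by-linear
# association p-value, following the formula quoted above.
# x: categorical variable (factor); y: grouping variable (factor).
p_trend_lbl <- function(x, y) {
  ok <- complete.cases(x, y)      # assumption: drop pairs with missing values
  xs <- as.integer(x[ok])         # level codes 1, 2, 3, ...
  ys <- as.integer(y[ok])
  stat <- cor(xs, ys)^2 * (length(xs) - 1)
  # upper tail of a chi-square with 1 df, equivalent to 1 - pchisq(stat, 1)
  pchisq(stat, df = 1, lower.tail = FALSE)
}

# Toy data, for illustration only
x <- factor(c("a", "b", "a", "b", "b", "a", "b", "b"))
y <- factor(c(1, 1, 2, 2, 2, 3, 3, 3))
p_toy <- p_trend_lbl(x, y)
print(p_toy)
```

Because the statistic depends only on the integer scores, the order of the factor levels matters: reordering the levels of x or y can change the result.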

This p-value for trend generally differs from the overall p-value. Below is an example using the regicor data included in the compareGroups package, testing for a trend in the proportion of women across recruitment years:

> descrTable(year ~ sex, regicor, show.p.trend=TRUE)

--------Summary descriptives table by 'Recruitment year'---------

________________________________________________________________ 
              1995        2000        2005     p.overall p.trend 
              N=431       N=786      N=1077                      
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ 
Sex:                                             0.506    0.544  
    Male   206 (47.8%) 390 (49.6%) 505 (46.9%)                   
    Female 225 (52.2%) 396 (50.4%) 572 (53.1%)                   
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ 

Note that, in this example, the overall p-value (0.506) is different from the p-value for trend (0.544).
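The two p-values in the table can be checked with base R alone, starting from the printed counts. The sketch below rebuilds individual-level data from those counts, applies a chi-square test and the correlation-based formula, and also fits a logistic regression on the numeric year score. The `glm` call is only an assumption about what the manual calculation in the question looked like; its Wald test is a different test from the linear-by-linear one, so the two p-values need not coincide.

```r
# Counts copied from the table above (rows: sex, columns: recruitment year)
counts <- matrix(c(206, 390, 505,
                   225, 396, 572),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(sex  = c("Male", "Female"),
                                 year = c("1995", "2000", "2005")))

# Expand the counts to one row per participant
d <- as.data.frame(as.table(counts))            # columns: sex, year, Freq
d <- d[rep(seq_len(nrow(d)), d$Freq), c("sex", "year")]

# Overall test: Pearson chi-square on the 2 x 3 table
p_overall <- chisq.test(counts)$p.value

# Trend test: linear-by-linear formula on the integer level codes
r <- cor(as.integer(d$sex), as.integer(d$year))
p_trend <- pchisq(r^2 * (nrow(d) - 1), df = 1, lower.tail = FALSE)

# Assumed "manual" approach: logistic regression on the numeric year score
fit <- glm(sex ~ as.integer(year), family = binomial, data = d)
p_glm <- coef(summary(fit))[2, "Pr(>|z|)"]      # Wald p-value for the slope

# overall and trend should be about 0.506 and 0.544, as in the table
print(round(c(overall = p_overall, trend = p_trend, glm_wald = p_glm), 3))
```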

Regards,

Isaac.