GreenwoodLab / funtooNorm

Gender inference using chromosome Y incorrect #46

Open DanielEWeeks opened 2 years ago

DanielEWeeks commented 2 years ago

We recently ran a data set consisting entirely of women through funtooNorm, and it incorrectly reported that they were all male. This is because the gender-inference code, which is based on chromosome Y methylation levels, has its threshold comparison set up backwards.

In Wang et al. (2021), we see that gender inference should be based on this approach:

"The Y chromosome: the identified sex-associated CpG sites of males are highly methylated with beta values greater than 0.6 whereas females exhibited low methylation signals"

Wang Y, Hannon E, Grant OA, Gorrie-Stone TJ, Kumari M, Mill J, Zhai X, McDonald-Maier KD, Schalkwyk LC. DNA methylation-based sex classifier to predict sex and identify sex chromosome aneuploidy. BMC Genomics. 2021 Jun 28;22(1):484. DOI: https://doi.org/10.1186/s12864-021-07675-2
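As a minimal sketch of the rule quoted above, the following applies the 0.6 cutoff to per-sample median chromosome Y beta values. The matrix `beta_chrY`, its probe values, and the sample names are made up for illustration; Wang et al. work with selected sex-associated CpG sites, whereas this toy example simply takes the median over all rows.

```r
# Toy beta matrix: rows are hypothetical chrY probes, columns are samples.
# All values are invented for illustration only.
beta_chrY <- cbind(
  male_1   = c(0.85, 0.90, 0.78, 0.88),
  female_1 = c(0.05, 0.10, 0.30, 0.02),
  female_2 = c(0.12, 0.08, 0.04, 0.20)
)

# Median chrY beta per sample (base R equivalent of a column-wise median).
y_median <- apply(beta_chrY, 2, median)

# Rule from the quoted passage: high chrY methylation indicates a male sample.
is_male <- y_median >= 0.6
print(is_male)
```

With these toy values only `male_1` is flagged as male; flipping the comparison to `< 0.6` would flag the two female samples instead.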

However, the funtooNorm code classifies as male those samples whose median chromosome Y beta value is less than 0.6. (Note also that 'sex' needs to be coded as 0 and 1, i.e. 0 = FALSE = female and 1 = TRUE = male.) The relevant code, with the proposed fixes marked:

              ###### this part deal with chrY
              if(is.null(sex)){
                  mens=matrixStats::colMedians(calcBeta(object@signal$AchrY,
                                                    object@signal$BchrY))<0.6                    # <= FIX: >=0.6
                  message("we found ",sum(mens)," men and ",sum(!mens),
                          " women in your data set base on Y probes only")
                  }else{
                      mens=sex
                      message("There is ",sum(mens)," men and ",
                              sum(!mens)," women")
                      }
              # no correction for women
              object@predmat$AchrY=object@signal$AchrY
              object@predmat$BchrY=object@signal$BchrY
              if(1<sum(mens)){                                                                   # <= FIX: 1<=sum(mens)
                  object@predmat$AchrY[,mens]=
                      quantileNormalization(object@signal$AchrY[,mens])
                  object@predmat$BchrY[,mens]=
                      quantileNormalization(object@signal$BchrY[,mens])
                  }
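
On the note about how 'sex' is coded: because `mens` is used as a column index (`[,mens]`), it may be safer to pass a logical vector than numeric 0/1. In R, a numeric vector used as an index selects by position, so zeros are dropped and every 1 selects the first column. The sketch below illustrates this with a toy matrix; `as_male_mask` is a hypothetical helper, not part of funtooNorm.

```r
# Hypothetical helper (not part of funtooNorm): coerce a user-supplied `sex`
# vector to logical, with TRUE = male, so it is safe for column indexing.
as_male_mask <- function(sex) {
  if (is.logical(sex)) return(sex)
  if (is.numeric(sex) && all(sex %in% c(0, 1))) return(sex == 1)
  stop("`sex` must be logical or coded 0 (female) / 1 (male)")
}

# Toy signal matrix with four samples; values are arbitrary.
signal <- matrix(1:8, nrow = 2,
                 dimnames = list(NULL, c("s1", "s2", "s3", "s4")))
sex <- c(0, 1, 1, 0)

# Numeric 0/1 used directly as an index: zeros are dropped, 1 repeats column 1.
colnames(signal[, sex, drop = FALSE])    # "s1" "s1"

# Logical mask selects the intended male columns.
mens <- as_male_mask(sex)
colnames(signal[, mens, drop = FALSE])   # "s2" "s3"
```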