liusihan / seGMM

A new tool to infer sex from massively parallel sequencing data.
MIT License

Mistake in code that classifies XXX #11

Open madleina opened 9 months ago

madleina commented 9 months ago

Hey

I think there is a mistake in the code that classifies XXX samples (seGMM.r, line 106). The first statement is

feature[i,"Xmap"]>(2*mean_f_xmap)

which means that you expect twice as many reads on the X chromosome in XXX samples as in females (XX). I think the factor should rather be 1.5, since going from two X copies (XX) to three (XXX) is a factor of 1.5. Do you agree with this?
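To make the arithmetic concrete, here is a minimal sketch in R. It assumes, as a simplification, that the X mapping rate scales roughly linearly with X copy number. The numeric values and the variable names other than `mean_f_xmap` are hypothetical and are not taken from seGMM.r or from any real data.

```r
# Simplifying assumption: X mapping rate is roughly proportional to the
# number of X chromosome copies.
x_copies_xx  <- 2
x_copies_xxx <- 3
expected_ratio <- x_copies_xxx / x_copies_xx  # 3 / 2

# Hypothetical mean X mapping rate of XX females (illustrative value only).
mean_f_xmap <- 0.05

# Hypothetical XXX sample with some noise around the expected ratio
# (0.08 / 0.05 = 1.6).
xmap_sample <- 0.08

# First statement of the XXX check with the current factor and with the
# proposed factor.
passes_factor_2   <- xmap_sample > (2 * mean_f_xmap)
passes_factor_1_5 <- xmap_sample > (1.5 * mean_f_xmap)

print(c(expected_ratio = expected_ratio,
        passes_factor_2 = passes_factor_2,
        passes_factor_1_5 = passes_factor_1_5))
```

Under these assumed numbers, the sample fails the factor 2 comparison but passes the factor 1.5 comparison, which is the concern raised above.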

liusihan commented 9 months ago

Because the X chromosome mapping rate can be affected by a variety of factors, we deliberately use a strict criterion: the X chromosome mapping rate of an XXX sample must be twice that of females (XX). Using a factor of 1.5 is also a viable alternative. Choosing the optimal threshold may require additional evaluation.