Closed fruce-ki closed 6 years ago
Tried with simulated data where the replicates within each condition are identical, so subsampling should yield 100% reproducible results.
rep_dtu_freq is still coming up as 0 everywhere. It seems to be a real code bug.
I found a typo that caused this type of bootstrapping to compare a condition with itself. This explains the 0% DTU frequency even for abundances designed to show up as DTU.
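To illustrate why this kind of typo produces exactly 0%, here is a minimal toy sketch in Python. It is not the RATs code: the function names, the proportion-shift threshold, and the data are all made up for illustration, and the real test in RATs is different. It only shows that pairing a replicate with itself can never register a difference, no matter how strong the designed DTU is.

```python
from itertools import product

def proportions(counts):
    """Convert isoform counts for one gene into relative proportions."""
    total = sum(counts)
    return [c / total for c in counts]

def toy_is_dtu(sample_a, sample_b, threshold=0.2):
    """Hypothetical DTU call: any isoform proportion shifts by more than threshold."""
    pa, pb = proportions(sample_a), proportions(sample_b)
    return any(abs(x - y) > threshold for x, y in zip(pa, pb))

def toy_rep_dtu_freq(cond_a, cond_b, buggy=False):
    """Fraction of cross-replicate pairings that are called DTU.

    With buggy=True the second argument of the comparison is taken from
    condition A again, mimicking a condition being compared with itself.
    """
    calls = []
    for rep_a, rep_b in product(cond_a, cond_b):
        other = rep_a if buggy else rep_b
        calls.append(toy_is_dtu(rep_a, other))
    return sum(calls) / len(calls)

# Simulated gene with two isoforms; replicates within a condition are identical
# and the isoform ratio is flipped between conditions (designed to be DTU).
cond_a = [[90, 10]] * 3
cond_b = [[10, 90]] * 3

correct_freq = toy_rep_dtu_freq(cond_a, cond_b)
buggy_freq = toy_rep_dtu_freq(cond_a, cond_b, buggy=True)
print(correct_freq, buggy_freq)  # prints: 1.0 0.0
```

With identical replicates, the correct pairing calls DTU in every iteration, while the self-comparison never does, which matches the 0 seen everywhere in the simulated data.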
This bug goes back all the way to 0.4.0. It only affects results for which rboot is set to TRUE. In most versions of RATs, qboot was FALSE by default. In the latest versions, however, this was TRUE by default.
The bug evaded detection for so long because, until 0.6.0 and the introduction of scaling, I attributed the poor detection of DTU by the cross-replicate bootstrapping process to low statistical power from comparing single samples. With scaling added, that explanation no longer held. Even so, our data often has high variability, and the highly variable genes are usually the ones I look at first, so again this error did not stand out. Yesterday, for the first time, I noticed some very blatant examples among highly consistent samples that were impossible to explain otherwise.
This reproducibility is coming out as 0% much more often than expected, even in genes where the replicates have nearly identical values and 100% reproducibility in qrep_dtu_freq. It is not clear whether this is a code bug or a genuine product of correct behaviour. This must be investigated and resolved.