Difficulty Extracting Telomere Length Data from qmotif XML Output

AdamaJava / adamajava


We don’t normalise directly to genome coverage. Instead, we scale to a nominal read count of 1B reads so that BAMs with different numbers of reads can be compared easily. If your BAM has 0.5B reads, all of the scaled scores will be double the raw counts, and if your BAM has 2B reads, the scaled scores will be half of the raw counts. We don’t take any account of unmapped reads, secondary alignments, etc. when scaling; we just count every read.

We take this simple approach because, for tumours, the correct approach is non-obvious. For example, if we have 3 chromosomes with whole-arm amplifications, how should we account for that? Clever/correct scaling is left as an exercise for the user, as they know their data best. Even with all of those caveats, qMotif scaled scores correlate very well with wet-lab techniques, as we showed in the qMotif paper, so we think the simple scaling approach probably works well enough in the majority of cases.
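As a rough illustration of the scaling arithmetic described above, here is a minimal Python sketch. It is not qMotif's actual code: the function name `scale_count` and the constant `NOMINAL_READS` are hypothetical, and it assumes the scaled score is simply the raw count multiplied by 1B and divided by the total number of reads in the BAM, with no rounding.

```python
# Hypothetical sketch of the scaling described above; not taken from qMotif itself.
NOMINAL_READS = 1_000_000_000  # nominal read count of 1B reads


def scale_count(raw_count, total_reads, nominal_reads=NOMINAL_READS):
    """Scale a raw motif count to the nominal read count.

    total_reads is assumed to be every read in the BAM, including
    unmapped reads and secondary alignments.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return raw_count * nominal_reads / total_reads


# A BAM with 0.5B reads: scaled scores are double the raw counts.
print(scale_count(100, 500_000_000))    # 200.0
# A BAM with 2B reads: scaled scores are half the raw counts.
print(scale_count(100, 2_000_000_000))  # 50.0
```

The same function can be pointed at a different denominator (for example, mapped reads only) if that suits your data better, which is the kind of custom scaling left to the user.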


Difficulty Extracting Telomere Length Data from qmotif XML Output #351