When calculating the fraction of pairs with long-range separation (d > 1kb, 5kb, 10kb), we must ensure that the denominator is consistent with the numerator.
Originally, we only considered intra-contig pairs. However, we're now using a "greedy" method which also estimates separation for inter-contig pairs that meet a certain constraint: the location of one read of the pair must, on its own, account for the entire separation represented by the bin. In effect, when estimating inter-contig separation, each contig has shoulder regions (of the bin size) which we ignore.
The pairs counted in the denominator should also meet this constraint, rather than being simply "all pairs which mapped".
As it stands, our fractions come out slightly lower than they should, because the denominator is still "all pairs which mapped" and so includes pairs which can never be counted in the numerator.
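
To make the proposed fix concrete, here is a minimal sketch of a fraction whose numerator and denominator use the same eligibility rule. It is an illustration only, not the existing implementation: the `Read` class, `outside_shoulder`, `long_range_fraction` and the `contig_lengths` mapping are hypothetical names, and it assumes that "accounting for the entire separation" means a read lies at least `d` bases from both ends of its contig, and that every intra-contig pair is eligible for the denominator.

```python
from dataclasses import dataclass


@dataclass
class Read:
    contig: str  # name of the contig the read mapped to
    pos: int     # mapped position on that contig


def outside_shoulder(read, contig_len, d):
    """True if the read is at least d bases from both ends of its contig.

    ASSUMPTION: this is how "one read accounts for the entire
    separation of the bin" is interpreted in this sketch.
    """
    return min(read.pos, contig_len - read.pos) >= d


def long_range_fraction(pairs, contig_lengths, d):
    """Fraction of eligible pairs with separation greater than d."""
    numer = denom = 0
    for r1, r2 in pairs:
        if r1.contig == r2.contig:
            # Intra-contig: separation is measured directly.
            denom += 1
            if abs(r1.pos - r2.pos) > d:
                numer += 1
        elif (outside_shoulder(r1, contig_lengths[r1.contig], d)
              or outside_shoulder(r2, contig_lengths[r2.contig], d)):
            # Inter-contig, one read clear of the shoulders:
            # counted in both numerator and denominator.
            denom += 1
            numer += 1
        # Otherwise both reads sit in shoulder regions; the pair is
        # ignored in the numerator, so it is left out of the denominator too.
    return numer / denom if denom else 0.0


contig_lengths = {"a": 10_000, "b": 10_000}
pairs = [
    (Read("a", 100), Read("a", 300)),    # intra-contig, short
    (Read("a", 100), Read("a", 5000)),   # intra-contig, long
    (Read("a", 5000), Read("b", 50)),    # inter-contig, eligible
    (Read("a", 200), Read("b", 9900)),   # inter-contig, both in shoulders
]
fraction = long_range_fraction(pairs, contig_lengths, 1000)
```

In this toy input, two of the three eligible pairs are long-range. Dividing by all four mapped pairs would instead give 0.5, which is the downward bias described above.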