nanoporetech / pore-c

Pore-C support
Mozilla Public License 2.0
52 stars 5 forks

Down-weighting pairs like SPRITE #50

Open jessakay opened 2 years ago

jessakay commented 2 years ago

Thanks for sharing the amazing tool that greatly simplifies Pore-C analysis! Would it be possible to add an option in the cooler export module to down-weight the pairwise interactions by the number of fragments per read, as was described in Quinodoz 2018? Specifically, from their methods section on "Generating pairwise contacts and heatmaps from SPRITE Data":

Because the number of pairwise contacts scales quadratically based on the number of reads (n) contained within a SPRITE cluster, larger clusters will contribute a disproportionally large number of the contacts observed between any two bins. To account for this, we reasoned that a minimally connected graph containing n reads would contain n-1 contacts. Therefore, we down-weighted each of the n(n-1)/2 pairwise contacts in a SPRITE cluster such that each pairwise contact has a weight of 2/n. In this way, the total contribution of pairwise contacts from a cluster is proportional to the minimally connected edges in the graph. This also ensures that the number of pairwise contacts contributed by a cluster is linearly proportional to the number of reads within a cluster.
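To make the request concrete, here is a minimal sketch of the quoted weighting applied to Pore-C reads, treating each read as a "cluster" and each aligned fragment as one of its n members. The function name `downweighted_contacts` and the input layout (a dict mapping read IDs to lists of bin IDs) are my own assumptions for illustration; this is not the pore-c API or its cooler export code.

```python
from collections import defaultdict
from itertools import combinations


def downweighted_contacts(reads):
    """Accumulate pairwise contact weights with the 2/n scheme.

    `reads` is a hypothetical input: a dict mapping a read ID to the
    list of genomic bin IDs hit by that read's fragments.
    """
    weights = defaultdict(float)
    for bins in reads.values():
        n = len(bins)
        if n < 2:
            continue  # a single fragment yields no pairwise contact
        w = 2.0 / n  # each of the n(n-1)/2 pairs gets weight 2/n
        for a, b in combinations(bins, 2):
            key = (a, b) if a <= b else (b, a)  # symmetric bin pair
            weights[key] += w
    return dict(weights)


if __name__ == "__main__":
    example = {"read1": [0, 1], "read2": [0, 1, 2, 3]}
    for pair, weight in sorted(downweighted_contacts(example).items()):
        print(pair, weight)
```

With this weighting, a read with n fragments contributes a total weight of n(n-1)/2 × 2/n = n-1, so its contribution grows linearly with n rather than quadratically.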

Would you also kindly offer your thoughts on whether this down-weighting scheme seems as suitable for Pore-C as it is for SPRITE? And what about the recently proposed normalized pointwise mutual information (NPMI) values from Winick-Ng 2021?
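For reference, a rough sketch of what I mean by NPMI, using the standard definition (PMI divided by the negative log of the joint probability) computed from co-occurrence counts. I am assuming here that each Pore-C read plays the role of one observation and that two bins "co-occur" when the same read touches both. The function name `npmi` and its arguments are hypothetical, and I have not checked that this matches the exact procedure in Winick-Ng 2021.

```python
import math


def npmi(n_total, n_x, n_y, n_xy):
    """Normalized pointwise mutual information from counts.

    n_total: number of observations (assumed here: reads)
    n_x, n_y: observations containing bin x, bin y
    n_xy: observations containing both bins
    """
    if n_xy == 0:
        return -1.0  # limiting value: the bins never co-occur
    if n_xy == n_total:
        return math.nan  # both bins in every observation: undefined (0/0)
    p_x = n_x / n_total
    p_y = n_y / n_total
    p_xy = n_xy / n_total
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / -math.log(p_xy)
```

The value ranges from -1 (never together) through 0 (independent) to 1 (always together), which is what makes it attractive for comparing bin pairs with different coverage.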