kalavattam closed this issue 1 year ago
Hi, thank you for your message.
-u parameter.
-S would be recommended); last time I checked, Juicer restored the original number of contacts (so scaling would be necessary); I don't know what Cooler does, however - you would need to consult their docs.
mappability vector, and coloured grey in the matrix. In Juicer we rely on NaN entries in the bias vector to do the same, but we have not implemented anything similar for Cooler files. What does the bias vector look like in those regions? Is it NaN or 0? Or does it contain a positive, finite value?
Thank you for taking the time to address my questions. In light of your explanations and the program's detailed documentation, I opted to convert my raw .cool files to FAN-C .hic files. Working with .hic files, in this case, seems to offer a more straightforward path to achieving my objectives.
For point 3 above, I have sought clarification here; for point 2 above, here. Thank you again—closing the issue now.
Response from Nezar to point 2:
By default, cooler rescales the target matrix (whole genome by default, or each chromosome for cis-only balancing) to make the marginal sums = 1.
This can be turned off with the rescale_marginals option in balance_cooler; however, we also store the original scaling factor in the metadata attributes of the weight vector:
with clr.open("r") as f:
    scale = f["bins/weight"].attrs["scale"]
This scaling factor corresponds to the marginal sum of the target matrix at the end of balancing, which roughly corresponds to its average read coverage. If desired, you can restore this scale factor by multiplying a balanced matrix by scale or, equivalently, by multiplying the balancing weight vector by sqrt(scale). For a log2-ratio, though, you don't want your contact frequencies to be proportional to coverage.
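The equivalence between the two ways of restoring the scale can be checked on a toy matrix. This is a minimal sketch in plain Python that does not call cooler; the matrix, weights, and scale value are made up for illustration, and `apply_weights` is a hypothetical helper.

```python
import math

# Hypothetical toy data: a symmetric raw count matrix A, balancing weights w,
# and a scale factor such as the one stored in the weight column's attributes.
A = [[4.0, 2.0], [2.0, 8.0]]
w = [0.5, 0.25]
scale = 16.0


def apply_weights(matrix, weights):
    """Return B with B[i][j] = weights[i] * weights[j] * matrix[i][j]."""
    n = len(matrix)
    return [
        [weights[i] * weights[j] * matrix[i][j] for j in range(n)]
        for i in range(n)
    ]


# Option 1: balance first, then multiply every entry by scale.
balanced = apply_weights(A, w)
rescaled_matrix = [[scale * value for value in row] for row in balanced]

# Option 2: multiply the weights by sqrt(scale), then balance.
rescaled_weights = [math.sqrt(scale) * x for x in w]
rescaled_via_weights = apply_weights(A, rescaled_weights)

print(rescaled_matrix)
print(rescaled_via_weights)
```

Both routes give the same matrix because each entry picks up the factor sqrt(scale) twice, once from w_i and once from w_j.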
Response from Nezar to point 3:
Regions with 0 contacts are normally masked out for matrix balancing (the algorithm would never converge if they were kept). Masked/filtered bins are normally encoded as NaN in the weight vector, which will "NaN-out" the corresponding row/column of the matrix when the weights are applied (B_ij = w_i w_j A_ij).
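The "NaN-out" effect, and the earlier question of whether the bias vector holds NaN, 0, or a positive finite value, can be illustrated with a small sketch. It is plain Python with made-up data, not code from cooler or FAN-C; `apply_weights` and `classify_weights` are hypothetical helpers.

```python
import math

# Hypothetical 3-bin example: bin 1 has no contacts, so its weight is NaN (masked).
A = [
    [10.0, 0.0, 5.0],
    [0.0, 0.0, 0.0],
    [5.0, 0.0, 20.0],
]
w = [0.2, math.nan, 0.1]


def apply_weights(matrix, weights):
    """Return B with B[i][j] = weights[i] * weights[j] * matrix[i][j]."""
    n = len(matrix)
    return [
        [weights[i] * weights[j] * matrix[i][j] for j in range(n)]
        for i in range(n)
    ]


def classify_weights(weights):
    """Count NaN, zero, and positive finite entries in a weight (bias) vector."""
    counts = {"nan": 0, "zero": 0, "positive_finite": 0, "other": 0}
    for x in weights:
        if math.isnan(x):
            counts["nan"] += 1
        elif x == 0:
            counts["zero"] += 1
        elif x > 0 and math.isfinite(x):
            counts["positive_finite"] += 1
        else:
            counts["other"] += 1
    return counts


B = apply_weights(A, w)
print(classify_weights(w))
for row in B:
    print(row)
```

If bin 1 carried a weight of 0 instead of NaN, the same row and column would come out as 0 rather than NaN, so the two encodings are worth telling apart before plotting.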
Hi @kaukrise,
Thanks again for this great program. I’ve been using fanc compare to generate log2-transformed comparison values with the following command:
I have a few questions I hope you can clarify:
fanc compare automatically apply weights?
fanc compare? Does this requirement also apply to ICE-balanced matrices?
fancplot, is there a way to exclude regions with zero pairs from being colored by the colormap? In the attached PDF, it appears that regions without pairs are assigned a value of 0.
log2_Q-over-G2_6400_XII-1-800000.scale.pdf
I appreciate your time and assistance. Thanks, Kris
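For the question about regions with zero pairs, one generic approach is to store NaN rather than 0 wherever either matrix has no contacts, so those cells are not mapped onto the colormap. The sketch below is plain Python and is not how fanc compare or fancplot work internally; the function name and toy matrices are hypothetical.

```python
import math


def log2_ratio_masked(a, b):
    """Return log2(a / b) element-wise, with NaN where either value is not positive."""
    result = []
    for row_a, row_b in zip(a, b):
        out_row = []
        for x, y in zip(row_a, row_b):
            if x > 0 and y > 0:
                out_row.append(math.log2(x / y))
            else:
                # Zero (or NaN) in either matrix: mark the cell as missing.
                out_row.append(math.nan)
        result.append(out_row)
    return result


# Hypothetical toy matrices standing in for the two samples being compared.
q = [[8.0, 0.0], [0.0, 2.0]]
g2 = [[2.0, 4.0], [4.0, 0.0]]

ratio = log2_ratio_masked(q, g2)
for row in ratio:
    print(row)
```

Plotting libraries such as matplotlib generally treat NaN cells as missing data and draw them in a separate "bad" colour instead of the colormap value for 0.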