Closed Lucas446 closed 1 year ago
What do you mean by matrices? You can use -viz to get tracks that look like this. This uses a file created from -analyze after differential calling is run.
If you want to visualize compartments on an axis with the underlying Hi-C matrix (e.g. Fig A here), you could use Juicebox, upload the .hic file, and add PC1.bedGraph (or PC2 if e.g. PC1 captures a chromosome arm).
Re: which file to use from the options you listed, by default, dcHiC tries to output just about anything you might want from the intermediate levels; for instance, .svd.rds is the R binary dump of the SVD. The PC files are the ones you'll want for viz: .pc.txt and .pc.bedGraph have results for all PC's, and .PC1.bedGraph and .PC2.bedGraph just contain data for the specified PC. Use whichever format works best for your purposes!
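For anyone who wants to work with the single-PC bedGraph files directly, here is a minimal sketch of reading one and labelling bins by the sign of the score. It assumes the standard four-column bedGraph layout (chrom, start, end, value) and assumes that positive values correspond to the A compartment; that orientation is an assumption for illustration and should be checked against your own data. The function names and the example values are hypothetical, not part of dcHiC.

```python
def read_bedgraph(lines):
    """Parse bedGraph lines into (chrom, start, end, value) tuples."""
    records = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and optional track/browser/comment header lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, value = line.split()[:4]
        records.append((chrom, int(start), int(end), float(value)))
    return records


def label_compartments(records):
    """Label each bin 'A' if its score is positive, otherwise 'B' (assumed convention)."""
    return [(c, s, e, "A" if v > 0 else "B") for c, s, e, v in records]


# Made-up example values, not real dcHiC output.
example = [
    "track type=bedGraph name=PC1",
    "chr2L\t0\t20000\t0.8",
    "chr2L\t20000\t40000\t-1.2",
    "chr2L\t40000\t60000\t0.1",
]
labels = label_compartments(read_bedgraph(example))
print(labels)
```

With a real file you would pass an open file handle instead of the example list, e.g. read_bedgraph(open(path)) for whichever .PC1.bedGraph or .PC2.bedGraph file you chose.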
I see, thanks for the explanations.
Concerning the matrices, I was wondering whether there is a way to extract the Hi-C contact map of a sample after the distance normalization step, for example to extract normalized contact values at various positions stored in a .bed file?
best,
Also, concerning PC1 and PC2, I realized that dcHiC is taking PC1, which clearly captures chromosome arms, for the differential analysis. Is there a way to force it to take PC2?
Sorry about the late response - I don't think dcHiC currently outputs that map because it's so memory-expensive, but plenty of other tools (e.g. HOMER) do this!
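To illustrate what such a lookup involves, here is a rough sketch of a generic observed/expected distance normalization on a small dense matrix, followed by a lookup by genomic position. This is not dcHiC's internal procedure and not HOMER's either; it is a common textbook-style approach, assumed here for illustration. The function names, the toy matrix, and the 20 kb resolution are all hypothetical.

```python
def distance_normalize(matrix):
    """Divide each entry by the mean of its diagonal (observed/expected)."""
    n = len(matrix)
    expected = []
    for d in range(n):
        # All contacts separated by exactly d bins.
        diag = [matrix[i][i + d] for i in range(n - d)]
        expected.append(sum(diag) / len(diag))
    return [
        [
            matrix[i][j] / expected[abs(i - j)] if expected[abs(i - j)] > 0 else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]


def value_at(norm, pos_a, pos_b, resolution):
    """Look up the normalized contact between two base-pair positions."""
    return norm[pos_a // resolution][pos_b // resolution]


# Toy symmetric 4-bin matrix at a hypothetical 20 kb resolution.
raw = [
    [10, 4, 2, 1],
    [4, 10, 6, 2],
    [2, 6, 10, 4],
    [1, 2, 4, 10],
]
norm = distance_normalize(raw)
# Positions would normally come from the start columns of a .bed file.
print(value_at(norm, 25000, 45000, 20000))
```

A dense list-of-lists like this only works for small regions; for a whole chromosome at fine resolution the memory cost mentioned above becomes the limiting factor, so a sparse representation would be needed.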
dcHiC doesn't take PC1 by default; it tries to intelligently select the "best" PC, but sometimes it gets this wrong. You can see the PC assignments in an output text file, and re-select the PC by using a utility (see this wiki page).
Hi,
Concerning PC1 and PC2, here are the compartment results when PC1 or PC2 is selected: PC1:
PC2:
Do you think it is more accurate to force-select PC2 in that case?
Thanks a lot, Best,
Hmmm... these are really poor-looking PC's. I'm unsure how interpretable these results are. I'm only familiar with mouse/human data from my own experimentation and it seems like this is a different genome; is this more or less what you expected to see for compartments? Or were you looking for something with less noise?
I might try looking at other PC's (3/4) or finding different data.
Also, is it only for this chromosome that you're facing the issue, or is it genome-wide?
Hi Jeffrey,
Thanks for your advice. All the chromosomes are like that. I am processing a different dataset from Drosophila (180Mb genome) to see if my data are really noisy. I did Hi-C on 10000 cells, which might explain the lack of clear information.
How do I look at PC3 and PC4? Do I have to modify pc_k = 3 on line 42 of dcHIC.r?
Best,
Please set the --pc option to 4 during the pca step. It will generate PC's up to 4.
Also, what is the Hi-C resolution you're using to analyze the compartments?
OK, I will try to look at the other PCs. The resolution I am using is 20kb. (Hi-C from 10'000 cells, DpnII, biotin step, around 4 million valid pairs per replicate)
Here are PC1, 2, 3 and 4 for chr2L. I feel that PC2 is the best here, what do you think?
If I change the resolution from 20kb to 50kb, I get less noise, as expected: the first two rows are 20kb for two time points, the last two rows are the same time point at 50kb resolution
Yes, PC2 is much better. In any case, using a lower resolution (larger bins) generally improves the compartment calls.
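A back-of-the-envelope calculation shows why larger bins help with sparse data. The sketch below takes the numbers mentioned in this thread (about 4 million valid pairs per replicate, a 180Mb genome) and computes the average number of contacts per bin pair at 20kb and 50kb. It assumes, unrealistically, that contacts are spread evenly over all bin pairs, so it only indicates the relative change between resolutions; the function name is hypothetical.

```python
def mean_contacts_per_bin_pair(valid_pairs, genome_size_bp, resolution_bp):
    """Average contacts per bin pair if pairs were spread evenly (crude assumption)."""
    n_bins = -(-genome_size_bp // resolution_bp)  # ceiling division
    n_bin_pairs = n_bins * (n_bins + 1) // 2      # upper triangle including the diagonal
    return valid_pairs / n_bin_pairs


for res in (20_000, 50_000):
    avg = mean_contacts_per_bin_pair(4_000_000, 180_000_000, res)
    print(f"{res // 1000} kb: {avg:.3f} contacts per bin pair")
```

Under these assumptions the average is roughly 0.1 contacts per bin pair at 20kb and roughly 0.6 at 50kb, i.e. about six times more signal per matrix entry at the coarser resolution.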
Hi,
If I want to plot the A/B compartments as matrices, which output file would you use?
Also, could you explain the role of the following files, and which ones can be used to visualize the results: chrX.bed, chrX.cmat.txt, chrX.distparam, chrX.pc.bedGraph, chrX.pc.txt, chrX.PC1.bedGraph, chrX.PC2.bedGraph, chrX.precmat.txt, chrX.svd.rds
Best,