ay-lab / dcHiC

dcHiC: Differential compartment analysis for Hi-C datasets
MIT License

Visualization #65

Closed Lucas446 closed 1 year ago

Lucas446 commented 1 year ago

Hi,

If I want to plot the A/B compartments as matrices, which output file should I use?

Also, could you explain the role of the following files, and which of them can be used to visualize the results: chrX.bed, chrX.cmat.txt, chrX.distparam, chrX.pc.bedGraph, chrX.pc.txt, chrX.PC1.bedGraph, chrX.PC2.bedGraph, chrX.precmat.txt, chrX.svd.rds?

Best,

jeffreygwang commented 1 year ago

What do you mean by matrices? You can use -viz to get tracks that look like this. This uses a file created from -analyze after differential calling is run.

If you want to visualize compartments on an axis with the underlying Hi-C matrix (e.g. Fig A here), you could use Juicebox, upload the .hic file, and add PC1.bedGraph (or PC2 if e.g. PC1 captures a chromosome arm).

Re: which file to use from the options you listed: by default, dcHiC tries to output just about anything you might want from the intermediate steps. For instance, .svd.rds is the R binary dump of the SVD. The PC files are the ones you'll want for viz: .pc.txt and .pc.bedGraph have results for all PCs, while .PC1.bedGraph and .PC2.bedGraph contain data only for the specified PC. Use whichever format works best for your purposes!
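For anyone who wants to work with a single-PC bedGraph outside a genome browser, here is a minimal Python sketch. It assumes the standard four-column bedGraph layout (chrom, start, end, value) and the common convention that positive PC values mark A and negative values mark B. Neither the function names nor the example values come from dcHiC, and the sign convention should be checked against your own data.

```python
import csv
import io


def read_pc_bedgraph(handle):
    """Parse bedGraph rows (chrom, start, end, value), skipping header lines."""
    bins = []
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and browser/track/comment headers.
        if not row or row[0].startswith(("track", "browser", "#")):
            continue
        bins.append((row[0], int(row[1]), int(row[2]), float(row[3])))
    return bins


def label_compartments(bins):
    """Label each bin by the sign of its PC value.

    Assumption: positive = A, negative = B. Verify the orientation for
    your own data before relying on these labels.
    """
    labels = []
    for chrom, start, end, value in bins:
        if value > 0:
            label = "A"
        elif value < 0:
            label = "B"
        else:
            label = "NA"
        labels.append((chrom, start, end, label))
    return labels


# Made-up example standing in for a file such as chrX.PC1.bedGraph.
example = "track type=bedGraph\nchrX\t0\t20000\t0.8\nchrX\t20000\t40000\t-1.2\n"
labels = label_compartments(read_pc_bedgraph(io.StringIO(example)))
print(labels)
```

To read a real file, pass an open file handle (e.g. `open("chrX.PC1.bedGraph")`) instead of the `io.StringIO` example.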

Lucas446 commented 1 year ago

I see, thanks for the explanations.

Concerning the matrices, I was wondering whether there is a way to extract the Hi-C contact map of a sample after the distance normalization step, for example to look up normalized contact values at various positions stored in a .bed file?

best,

Lucas446 commented 1 year ago

Also, concerning PC1 and PC2, I realized that dcHiC is taking PC1, which clearly captures the chromosome arms, for the differential analysis. Is there a way to force it to take PC2?

jeffreygwang commented 1 year ago

Sorry about the late response - I don't think dcHiC currently outputs that map because it's so memory-expensive, but plenty of other tools (e.g. HOMER) do this!
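As a rough illustration of what such a lookup could look like outside dcHiC, here is a small Python sketch of a generic observed/expected (O/E) distance normalization on a dense matrix, followed by a lookup of one bin pair from genomic positions. This is a textbook-style O/E and is only assumed to resemble what a distance-normalization step does; it is not dcHiC's internal code, and the function names and toy counts are made up.

```python
def observed_over_expected(matrix):
    """Divide each contact by the mean contact count at the same bin distance."""
    n = len(matrix)
    # expected[d] = mean of the d-th diagonal (all pairs separated by d bins).
    expected = []
    for d in range(n):
        diagonal = [matrix[i][i + d] for i in range(n - d)]
        expected.append(sum(diagonal) / len(diagonal))
    oe = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            e = expected[abs(i - j)]
            oe[i][j] = matrix[i][j] / e if e > 0 else 0.0
    return oe


def bin_index(position, resolution):
    """Map a genomic coordinate (bp) to its bin index at the given resolution."""
    return position // resolution


# Toy symmetric contact counts for three 20 kb bins.
counts = [
    [10, 4, 2],
    [4, 10, 6],
    [2, 6, 10],
]
oe = observed_over_expected(counts)

# Look up the normalized contact between positions 5,000 and 25,000.
i = bin_index(5_000, 20_000)
j = bin_index(25_000, 20_000)
print(oe[i][j])  # 4 divided by the mean of the first off-diagonal (4 and 6)
```

For real data the dense matrix would not fit in memory at fine resolutions, which is the concern raised above; a sparse representation or a dedicated tool is the practical route.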

dcHiC doesn't take PC1 by default; it tries to intelligently select the "best" PC, but sometimes it gets this wrong. You can see the PC assignments in an output text file, and re-select them using a utility (see this wiki page).

Lucas446 commented 1 year ago

Hi,

Concerning PC1 and PC2, here are the compartment results when PC1 or PC2 is selected. PC1:

Screen Shot 2023-04-10 at 1 24 10 PM

PC2:

Screen Shot 2023-04-10 at 1 24 16 PM

Do you think it is more accurate to force-select PC2 in that case?

Thanks a lot, Best,

jeffreygwang commented 1 year ago

Hmmm... these are really poor-looking PCs. I'm unsure how interpretable these results are. I'm only familiar with mouse/human data from my own experimentation, and it seems like this is a different genome; is this more or less what you expected to see for compartments? Or were you looking for something with less noise?

I might try looking at other PCs (3/4) or finding different data.

ay-lab commented 1 year ago

Also, is it only for this chromosome that you're facing the issue, or is it genome-wide?

Lucas446 commented 1 year ago

Hi Jeffrey,

Thanks for your advice. All the chromosomes look like that. I am processing a different dataset from Drosophila (180 Mb genome) to see whether my data are really noisy. I did Hi-C on 10,000 cells, which might explain the lack of clear signal.

How do I look at PC3 and PC4? Do I have to modify pc_k = 3 on line 42 of dcHIC.r?

Best,

abhijitcbio commented 1 year ago

Please set the --pc option to 4 during the pca step. It will generate PCs up to 4. Also, what Hi-C resolution are you using to analyze the compartments?

Lucas446 commented 1 year ago

OK, I will try looking at the other PCs. The resolution I am using is 20 kb. (Hi-C from 10,000 cells, DpnII, biotin step, around 4 million valid pairs per replicate.)

Lucas446 commented 1 year ago

Here are PC1, PC2, PC3, and PC4 for chr2L. I feel that PC2 is the best here; what do you think?

Screen Shot 2023-04-12 at 4 37 58 PM

If I change the resolution from 20 kb to 50 kb, I get less noise, as expected. The first two rows are at 20 kb for two time points; the last two rows are the same time points at 50 kb resolution.

Screen Shot 2023-04-12 at 4 32 25 PM

ay-lab commented 1 year ago

Yes, PC2 is way better. In any case, using a lower resolution (i.e., larger bins) always improves the compartment calls.
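To illustrate why larger bins reduce noise, here is a minimal Python sketch that coarsens a dense contact matrix by an integer factor, summing the counts of neighboring bins so that each coarse bin pools more reads. The function name and toy matrix are made up for illustration. The factor must be an integer (e.g. 20 kb to 40 kb); a non-integer change such as 20 kb to 50 kb typically means re-binning from the valid pairs instead.

```python
def coarsen(matrix, factor):
    """Sum factor-by-factor blocks of a square contact matrix into one bin."""
    n = len(matrix)
    m = (n + factor - 1) // factor  # number of coarse bins, rounding up
    out = [[0] * m for _ in range(m)]
    for i in range(n):
        for j in range(n):
            out[i // factor][j // factor] += matrix[i][j]
    return out


# Toy 4x4 matrix at a fine resolution (e.g. four 20 kb bins).
fine = [
    [5, 1, 0, 0],
    [1, 5, 1, 0],
    [0, 1, 5, 1],
    [0, 0, 1, 5],
]
coarse = coarsen(fine, 2)  # two 40 kb bins
print(coarse)
```

Total counts are preserved, so each coarse bin carries more reads, at the cost of compartment boundaries being resolved less precisely.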