ay-lab / FitHiChIP

Statistically Significant loops from HiChIP data
MIT License

How to get common loops between two samples #101

Open remeseirogrp opened 12 months ago

remeseirogrp commented 12 months ago

Hello, I have a question regarding the categorization of loops. I used FitHiChIP to call loops for two CTCF-HiChIP samples (L, peak to all, FDR 0.01), and then used DiffAnalysisHiChIP.r to call differential loops. My question is: how do I get the common loops between the two samples? I want to determine the total number of loops lost and gained from one sample to the other, and also get a list of significant loops that stay unchanged in both samples.

Best, Chaitali Chakraborty, PhD, Postdoc, group Remeseiro, UCMM, Umeå University, Sweden

souryacs commented 12 months ago

Hi Chaitali, Thanks for your message. To compute the overlap between two sets of loops, do either of the following:

1) You can check our utility script, which can compute the overlap of up to 5 different sets of loops. Using the "offset" parameter, you can even allow some slack between the interacting bins in the overlap analysis. For example, offset = 0 means exact overlap, where two loops must have exactly overlapping bins. On the other hand, offset > 0 means that two loops whose interacting bins lie within "offset" bp of each other are also considered overlapping.

2) You can also use the "bedtools pairtopair" routine to compute the exact overlap between two sets of loops.
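To make the "offset" idea concrete, here is a minimal Python sketch of one plausible reading of it. It is not the code of the utility script: the `Loop` type, `loops_overlap`, and `partition` are hypothetical names, the loops are toy values, and the rule "both anchor starts differ by at most `offset` bp" is an assumption about how the slack is applied.

```python
from typing import List, NamedTuple, Tuple


class Loop(NamedTuple):
    # One loop as a pair of interacting bins (BEDPE-like; hypothetical type).
    chrom1: str
    start1: int
    end1: int
    chrom2: str
    start2: int
    end2: int


def loops_overlap(a: Loop, b: Loop, offset: int = 0) -> bool:
    """Assumed rule: same chromosomes, and both anchor starts within `offset` bp."""
    return (
        a.chrom1 == b.chrom1
        and a.chrom2 == b.chrom2
        and abs(a.start1 - b.start1) <= offset
        and abs(a.start2 - b.start2) <= offset
    )


def partition(
    set_a: List[Loop], set_b: List[Loop], offset: int = 0
) -> Tuple[List[Loop], List[Loop], List[Loop]]:
    """Return (loops only in A, loops of A matched in B, loops only in B)."""
    common = [a for a in set_a if any(loops_overlap(a, b, offset) for b in set_b)]
    only_a = [a for a in set_a if a not in common]
    only_b = [b for b in set_b if not any(loops_overlap(a, b, offset) for a in set_a)]
    return only_a, common, only_b


# Toy data, not real results.
control = [
    Loop("chr1", 10000, 15000, "chr1", 50000, 55000),
    Loop("chr1", 20000, 25000, "chr1", 90000, 95000),
]
treat = [
    Loop("chr1", 10000, 15000, "chr1", 50000, 55000),
    Loop("chr1", 25000, 30000, "chr1", 90000, 95000),
]

lost, common, gained = partition(control, treat, offset=0)
lost_slack, common_slack, gained_slack = partition(control, treat, offset=5000)
```

In this toy case, the second loop differs by one 5 kb bin at its first anchor, so it counts as lost/gained with offset 0 and as common with offset 5000. The all-pairs comparison is quadratic and only meant for illustration; real loop sets would need an indexed lookup.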

remeseirogrp commented 12 months ago

Dear Dr. Bhattacharyya, Thank you for the suggestions. I will try them out.

remeseirogrp commented 11 months ago

Thank you for the suggestion. I got the common loops between the 2 conditions for CTCF-HiChIP by setting the offset to 0. My reasoning was that CTCF peaks in ChIP are sharp and well defined, and do not spread across the genome as much as histone marks do. With this setting, 11% of the loops are common. Do you think this criterion is too stringent?
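One way to judge whether offset 0 is too stringent is to see how the common fraction changes as the slack grows by one or two bins. The sketch below is only an illustration under stated assumptions: `common_fraction` is a hypothetical helper, `BIN_SIZE = 5000` is an assumed bin size (the thread does not state it), loops are simplified to `(chrom, start1, start2)` tuples, and the data are toy values.

```python
def common_fraction(loops_a, loops_b, offset):
    """Fraction of loops in loops_a that have a match in loops_b within `offset` bp.

    Each loop is a (chrom, start1, start2) tuple; intra-chromosomal loops assumed.
    """
    def matches(a, b):
        return (
            a[0] == b[0]
            and abs(a[1] - b[1]) <= offset
            and abs(a[2] - b[2]) <= offset
        )

    if not loops_a:
        return 0.0
    hits = sum(1 for a in loops_a if any(matches(a, b) for b in loops_b))
    return hits / len(loops_a)


BIN_SIZE = 5000  # assumption: bin size in bp, not taken from the thread

# Toy data, not real results.
control = [
    ("chr1", 10000, 50000),
    ("chr1", 20000, 90000),
    ("chr2", 5000, 40000),
    ("chr2", 100000, 150000),
]
treat = [
    ("chr1", 10000, 50000),
    ("chr1", 25000, 90000),
    ("chr2", 5000, 50000),
    ("chr3", 0, 30000),
]

# Common fraction of Control loops at slack of 0, 1 and 2 bins.
fractions = {
    k * BIN_SIZE: common_fraction(control, treat, k * BIN_SIZE) for k in range(3)
}
for offset, frac in fractions.items():
    print(f"offset={offset}: {frac:.2f} of Control loops are common")
```

If the fraction rises sharply when going from 0 to one bin of slack, many "unique" loops may just be one-bin shifts of the same contact rather than genuinely gained or lost loops.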

remeseirogrp commented 11 months ago

Hi, I want to mention a small anomaly in the loop overlap: the input labels are not assigned correctly to the loop samples. My command was this:

Rscript /proj/projfolder/nobackup/Utilities/Loop_Overlap_Venn/Venn_Interactions.r --FileList $SIGLOOPLIST/Control.interactions_FitHiC_Q0.01_MergeNearContacts.bed:$SIGLOOPLIST/Treat.interactions_FitHiC_Q0.01_MergeNearContacts.bed --Labels Control,Treat --offset 0 --OutDir $OLOUTPUT/SigOnly_LoopOverlap

In the output I get Control unique loops: 21000, common: 300, Treat unique loops: 19000. When I take this information into R to prepare browser files, I see that the function has swapped the labels on the .png file: in reality there are 21000 Treat unique loops and 19000 Control unique loops.

Did I specify the labels incorrectly, or is this intrinsic to the script?
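A quick way to check which label belongs to which file, independently of the plot, is to count the loops in each input file. With offset 0 and no duplicate loops, each file's total should equal its unique count plus the common count. The sketch below assumes tab-separated loop files with an optional header row; `count_loops` and `labels_consistent` are hypothetical helpers, and the files and counts are toy values.

```python
import os
import tempfile


def count_loops(path):
    """Count data rows; a row is skipped if its 2nd field is not an integer (header)."""
    n = 0
    with open(path) as fh:
        for line in fh:
            fields = line.split()
            if len(fields) >= 6 and fields[1].isdigit():
                n += 1
    return n


def labels_consistent(file_totals, reported_unique, common):
    """Per label: does the file total equal reported unique + common?

    Assumes offset 0 and no duplicate loops within a file.
    """
    return {
        label: file_totals[label] == reported_unique[label] + common
        for label in file_totals
    }


# Toy files: Control has 3 loops, Treat has 2 (not real data).
with tempfile.TemporaryDirectory() as tmp:
    paths = {}
    for label, n_rows in {"Control": 3, "Treat": 2}.items():
        paths[label] = os.path.join(tmp, label + ".bed")
        with open(paths[label], "w") as fh:
            fh.write("chr1\ts1\te1\tchr2\ts2\te2\tcc\n")  # header row
            for i in range(n_rows):
                s1, s2 = i * 5000, i * 5000 + 50000
                fh.write(f"chr1\t{s1}\t{s1 + 5000}\tchr1\t{s2}\t{s2 + 5000}\t10\n")
    totals = {label: count_loops(p) for label, p in paths.items()}

# Counts as printed on a (hypothetical) plot, then with the two labels swapped.
check_as_reported = labels_consistent(totals, {"Control": 1, "Treat": 2}, common=1)
check_swapped = labels_consistent(totals, {"Control": 2, "Treat": 1}, common=1)
print(check_as_reported, check_swapped)
```

If only the swapped assignment is consistent with the file totals, the labels on the plot are in the opposite order to the files in --FileList.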