Hi @Datacond, thanks for your query. In the first few lines you posted, the contact count is 1 and the distance is <= 200 Kb, so these contacts cannot be significant. In general, if FitHiChIP reports no significant interactions, either the sequencing depth is low or the contacts are randomly distributed (a lower-quality library). Please check the following: 1) The sequencing depth of your library. 2) The number and fraction of CIS reads, the fraction of duplicate reads, and the number and fraction of CIS reads > 10 Kb. 3) The first few lines of this file sorted by decreasing cc (that is, the highest cc values and the corresponding loops). 4) Whether you ran the P2P=0 (loose) or P2P=1 (stringent) model, and whether you used coverage bias regression.
@ay-lab Thank you very much for your help. In my data, the HiCPro output reports 3,470,043 cis reads, and 2,955,992 cis long-range reads (>20 kb, the HiCPro default). The HiCPro QC plot shows that cis long-range interactions (>20 kb) account for 15% of valid 3C pairs, and duplicates account for ~75% of valid 3C pairs. My focus is a transcription factor, so I used the default interaction type (peak-to-all). I also used P2P=0 and coverage bias regression. These are some of the parameters I set:
CircularGenome=0
IntType=3
BINSIZE=5000
LowDistThr=20000
UppDistThr=2000000
UseP2PBackgrnd=0
QVALUE=0.05
I wonder whether increasing the bin size would reveal significant chromatin loops.
Hi @Datacond Thanks for your reply. You can try with higher bin sizes (10 Kb or 20 Kb). Also, could you please dump the first few lines of the output file *fithic.bed (i.e. FitHiChIP output without any FDR-based filtering) after sorting them by decreasing order of contact count (7th column)? You can email me as well. I want to look into the distribution of contact counts across individual interactions.
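The sorting step requested above can be sketched in Python. This is a minimal sketch, assuming the file is tab-separated with the contact count in the 7th column and possibly a header line; the function name `top_contacts` and the file name in the usage note are hypothetical, not part of FitHiChIP.

```python
import csv


def top_contacts(path, n=10, cc_col=6):
    """Return the n rows with the highest contact count.

    cc_col is the 0-based index of the contact count (7th column -> 6).
    Assumes a tab-separated file; a header line, if present, is skipped
    because its contact-count field is not numeric.
    """
    rows = []
    with open(path, newline="") as handle:
        for fields in csv.reader(handle, delimiter="\t"):
            if len(fields) <= cc_col:
                continue  # blank or truncated line
            try:
                cc = float(fields[cc_col])
            except ValueError:
                continue  # header or malformed line
            rows.append((cc, fields))
    # Highest contact counts first.
    rows.sort(key=lambda item: item[0], reverse=True)
    return [fields for _, fields in rows[:n]]
```

For example, `for row in top_contacts("sample.interactions_FitHiC.bed"): print("\t".join(row))` would print the ten interactions with the highest contact counts, which is the part of the distribution that matters when judging whether any loop can reach significance.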
Hi @ay-lab, thank you very much for your reply.
With these parameters, no contact count in the 7th column of this file exceeds 2. I think this is related to the sequencing depth: the number of read pairs I gave HiCPro as input was too small (~60 million pairs), so no significant loops were found even after I increased the bin size to 20 kb. If you don't mind, could you help me with a few other questions about FitHiChIP?
Thanks again.
Hi @Datacond 1) Because the number of ChIP-seq peaks is much higher than the number of HiChIP peaks, and because FitHiChIP returns interactions that involve a peak in at least one end (peak-to-all mode), using ChIP-seq peaks generated many more HiChIP loops. We advise using ChIP-seq peaks if they are available. 2) -L is the read length of the sequencing data. The valid pairs file holds one coordinate for each end, so estimating the span of the read at each end requires the read length. For example, a paired-end CIS read with coordinates (chr1, X, Y) corresponds to two reads: one read spans X +/- L (depending on the strand; L = read length) while the other read spans Y +/- L. These two reads are used to estimate 1D peaks. 3) The ALL2ALL directory stores, for reference, the complete set of interactions (whether or not they involve peaks) subject to the distance thresholds. 3)
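The read-span reasoning in point 2 can be illustrated with a short sketch. It is an illustration under stated assumptions, not FitHiChIP's actual code: it assumes the stored coordinate is the 5' end of the read, so that a '+' strand read extends to the right and a '-' strand read extends to the left. The function name `read_span` and the example coordinates are hypothetical.

```python
def read_span(pos, strand, read_length):
    """Approximate genomic interval covered by one read end.

    Assumption: pos is the 5' coordinate of the read, so a '+' read
    covers [pos, pos + L] and a '-' read covers [pos - L, pos].
    """
    if strand == "+":
        return (pos, pos + read_length)
    if strand == "-":
        return (max(0, pos - read_length), pos)
    raise ValueError(f"unknown strand: {strand!r}")


# A paired-end CIS read (chr1, X, Y) with made-up values and L = 75.
x, y, length = 100_000, 250_000, 75
print(read_span(x, "+", length))  # (100000, 100075)
print(read_span(y, "-", length))  # (249925, 250000)
```

Each of the two intervals then counts as one read for 1D peak estimation, which is why the read length has to be supplied.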
Hi, I have a stupid question to ask you all. When I ran FitHiChIP, no errors were reported anywhere in the log. The only message was this:
I checked FitHiChIP_HiCPro.sh but couldn't see any problems. When I checked the output *.interactions_FitHiC.bed file, I found that Q-Value_Bias was 0.673508697849507 for every interaction, which is much larger than the configured threshold of 0.05. What is the reason for this? Here are the first 10 lines of this file:
Can someone help me with this problem?
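One possible explanation for identical Q-values, offered as a hedged sketch rather than a confirmed diagnosis: if the Q-values come from a Benjamini-Hochberg style FDR correction (an assumption here, not confirmed in this thread), then a set of p-values with no strong signal can collapse to a single adjusted value. The function `bh_adjust` below is a generic illustration with made-up p-values, not FitHiChIP's own code.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    n = len(pvalues)
    # Indices of the p-values in ascending order.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity:
    # q_(rank) = min over j >= rank of p_(j) * n / j, capped at 1.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values with no strong signal: every adjusted value
# collapses to the same number, driven by the largest p-value.
print(bh_adjust([0.30, 0.40, 0.50, 0.60, 0.65]))
```

Under that assumption, a constant Q-value across all interactions would be consistent with a library in which no interaction has a small p-value, for example when all contact counts are very low.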