ay-lab / FitHiChIP

Statistically Significant loops from HiChIP data
MIT License

Running FitHiChIP did not report any error, only print 'SORRY !!!!!!!! ……' #81

Closed Miracle-Yao closed 1 year ago

Miracle-Yao commented 2 years ago

Hi, I have a basic question. When I ran FitHiChIP, no errors were reported anywhere in the log. The only message was this:

SORRY !!!!!!!! FitHiChIP could not find any statistically significant interactions
Check the input parameters, or check if the number of input nonzero contact locus pairs are too few !!!
……
 so no WashU specific session file is created !!

I checked FitHiChIP_HiCPro.sh but couldn't see any problems. When I checked the output *.interactions_FitHiC.bed file, I found that the Q-Value_Bias of every interaction was 0.673508697849507, which is much larger than the configured threshold of 0.05. What is the reason for this? Here are the first lines of this file:

chr1    s1      e1      chr2    s2      e2      cc      Coverage1       isPeak1 Bias1   Mapp1   GCContent1      RESites1        Coverage2       isPeak2 Bias2   Mapp2   GCContent2      RESites2        Dist    p       exp_cc_Bias     p_Bias       dbinom_Bias     P-Value_Bias    Q-Value_Bias
chr1    3065000 3070000 chr1    3255000 3260000 1       23      0       1.33227210124986        0       0       0       12      1       0.568932903865512       0       0       0       190000  0.0000000749145799585738        1.01092311430721     0.00000285277359306708  0.367858172650478       0.636117605752353       0.673508697849507
chr1    3075000 3080000 chr1    3665000 3670000 1       16      1       0.758577205154016       0       0       0       13      0       0.753023361576008       0       0       0       590000  0.0000000252958390414761        1.00506513507252     0.00000283624267371923  0.36787525704318        0.633979728454977       0.673508697849507
chr1    3075000 3080000 chr1    3680000 3685000 1       16      1       0.758577205154016       0       0       0       19      0       1.10057260538032        0       0       0       605000  0.0000000241266311438299        1.00715047766413     0.00000284212740441108  0.367870600198408       0.634742212984412       0.673508697849507
chr1    3195000 3200000 chr1    3660000 3665000 1       17      0       0.984722857445549       0       0       0       18      1       0.853399355798268       0       0       0       465000  0.0000000328123981179091        1.00807467822359     0.00000284473545136678  0.367868031605972       0.635079629450677       0.673508697849507
chr1    3205000 3210000 chr1    3255000 3260000 1       16      0       0.926797983478164       0       0       0       12      1       0.568932903865512       0       0       0       50000   0.000000197804726960218 1.01661146866463    0.00000286882583964169   0.367829762143606       0.638181627650492       0.673508697849507
chr1    3320000 3325000 chr1    4050000 4055000 1       13      0       0.753023361576008       0       0       0       20      1       0.94822150644252        0       0       0       730000  0.0000000196293542917594        1.00524439815458     0.00000283674854501596  0.367874918847297       0.634045336682308       0.673508697849507
chr1    3355000 3360000 chr1    3705000 3710000 1       28      0       1.62189647108679        0       0       0       24      1       1.13786580773102        0       0       0       350000  0.0000000452210758282014        1.01550979824209     0.00000286571698176198  0.367836167510026       0.637782802261071       0.673508697849507
chr1    3380000 3385000 chr1    3660000 3665000 1       30      0       1.73774621902156        0       0       0       18      1       0.853399355798268       0       0       0       280000  0.0000000548403463266663        1.01814294788469     0.00000287314759607944  0.367820140554098       0.638735322464559       0.673508697849507
chr1    3395000 3400000 chr1    3920000 3925000 1       18      0       1.04264773141293        0       0       0       18      1       0.853399355798268       0       0       0       525000  0.0000000287850700574849        1.00805016830402     0.00000284466628562083  0.367868103718942       0.635070685146692       0.673508697849507
chr1    3420000 3425000 chr1    3450000 3455000 1       21      0       1.21642235331509        0       0       0       19      1       0.900810431120394       0       0       0       30000   0.000000287779052692279 1.0488539033899 0.00000295981234994964       0.367454988414394       0.649661502303642       0.673508697849507
chr1    3435000 3440000 chr1    3830000 3835000 1       18      0       1.04264773141293        0       0       0       14      1       0.663755054509764       0       0       0       395000  0.0000000389865070252085        1.00618536694617     0.00000283940391106956  0.367872951856357       0.634389527614606       0.673508697849507
chr1    3450000 3455000 chr1    3600000 3605000 1       19      1       0.900810431120394       0       0       0       24      0       1.39019697521725        0       0       0       150000  0.0000000907225705211327        1.02112150934193     0.00000288155294496332  0.367799047416525       0.639809773646697       0.673508697849507
chr1    3450000 3455000 chr1    3640000 3645000 1       19      1       0.900810431120394       0       0       0       22      0       1.27434722728248        0       0       0       190000  0.0000000749145799585738        1.01584676996146     0.00000286666789880904  0.367834254179031       0.637904839000142       0.673508697849507
chr1    3450000 3455000 chr1    3650000 3655000 1       19      1       0.900810431120394       0       0       0       18      0       1.04264773141293        0       0       0       200000  0.0000000723902775931431        1.0127485596292      0.00000285792490688754  0.367850318037217       0.636781249154961       0.673508697849507
chr1    3450000 3455000 chr1    3790000 3795000 1       19      1       0.900810431120394       0       0       0       21      0       1.21642235331509        0       0       0       340000  0.0000000471283093045569        1.01171271010644     0.00000285500179223805  0.367854922067694       0.636404813178109       0.673508697849507
chr1    3470000 3475000 chr1    3830000 3835000 1       10      0       0.579248739673853       0       0       0       14      1       0.663755054509764       0       0       0       360000  0.0000000426479825679182        1.00214753092722     0.00000282800934326815  0.367879113141187       0.632910263785966       0.673508697849507
chr1    3490000 3495000 chr1    4270000 4275000 1       13      0       0.753023361576008       0       0       0       7       1       0.331877527254882       0       0       0       780000  0.0000000175240962694153        1.00239265098731     0.00000282870105960609  0.367878908900329       0.633000234071422       0.673508697849507
chr1    3505000 3510000 chr1    3995000 4000000 1       20      0       1.15849747934771        0       0       0       21      1       0.995632581764646       0       0       0       490000  0.000000031320184639484 1.01025145192013    0.00000285087819598472   0.367860761122005       0.635873116839761       0.673508697849507
chr1    3590000 3595000 chr1    3705000 3710000 1       17      0       0.984722857445549       0       0       0       24      1       1.13786580773102        0       0       0       115000  0.000000111062528286944 1.02215274777396    0.00000288446304734938   0.367791014607298       0.640181025263908       0.673508697849507

Can someone help me with this problem?

ay-lab commented 2 years ago

Hi @Datacond, thanks for your query. In the first few lines, the contact count is 1, and the distance is <= 200 Kb. Of course, these contacts cannot be significant. In general, if FitHiChIP does not report any significant interactions, either the sequencing depth is low or the contacts are randomly distributed (a lower-quality library). Please check the following:

1) The sequencing depth of your library.
2) The number and fraction of CIS reads, the fraction of duplicate reads, and the number and fraction of CIS reads > 10 Kb.
3) The first few lines of this file sorted by decreasing cc (that is, the highest cc values and the corresponding loop descriptions).
4) Whether you ran the P2P=0 (loose) or P2P=1 (stringent) model, and whether you used coverage bias regression.
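One way to see why every row can end up with the same Q-value is sketched below. It assumes the Q-values come from a Benjamini-Hochberg style correction; the thread does not state which correction FitHiChIP applies, so treat this as an illustration only. The function name `benjamini_hochberg` is made up for this sketch, and the five inputs are the P-Value_Bias values of the first five rows quoted above.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted q-values in the same order as the input p-values.

    Assumption: a plain Benjamini-Hochberg step-up correction. FitHiChIP's
    actual correction is not described in this thread.
    """
    n = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(n), key=lambda i: pvalues[i])
    qvalues = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotonic.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        qvalues[idx] = running_min
    return qvalues


# P-Value_Bias of the first five rows quoted in this thread.
p = [
    0.636117605752353,
    0.633979728454977,
    0.634742212984412,
    0.635079629450677,
    0.638181627650492,
]
q = benjamini_hochberg(p)
print(q)
```

When all p-values are large and close together, the rank-scaled values `p * n / rank` are smallest at the largest p-value, so the running minimum assigns that single value to every row. Under this assumption, a constant Q-value across the whole file is what a set of uniformly weak contacts (cc = 1) would produce. The 0.6735 in the real file would come from the full set of locus pairs, not from these five rows.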

Miracle-Yao commented 2 years ago

@ay-lab Thank you very much for your help. In my data, the number of cis reads in the HiC-Pro output is 3,470,043. Cis long-range (>20 kb, the HiC-Pro default parameter) is 2,955,992. The HiC-Pro QC plot shows that cis long-range interactions (>20 kb) account for 15% of valid 3C pairs, and duplicates account for ~75% of valid 3C pairs. My target is a transcription factor, so I used the default interaction type (peak-to-all). I also used P2P=0 and coverage bias regression. These are some of the parameters I set:

CircularGenome=0
IntType=3
BINSIZE=5000
LowDistThr=20000
UppDistThr=2000000
UseP2PBackgrnd=0
QVALUE=0.05

I wonder whether increasing the bin size would yield significant chromatin loops.

ay-lab commented 2 years ago

Hi @Datacond Thanks for your reply. You can try with higher bin sizes (10 Kb or 20 Kb). Also, could you please dump the first few lines of the output file *fithic.bed (i.e. FitHiChIP output without any FDR-based filtering) after sorting them by decreasing order of contact count (7th column)? You can email me as well. I want to look into the distribution of contact counts across individual interactions.
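The sorting step requested here can be sketched as follows. This is a minimal sketch, assuming a tab-separated file with a header row in which the contact count is the column named `cc` (the 7th column, as in the dump earlier in this thread). The function name `top_by_contact_count` and the toy rows are hypothetical and only illustrate the idea.

```python
import csv
import io
from collections import Counter


def top_by_contact_count(handle, n=10, column="cc"):
    """Return the n rows with the highest contact count and a histogram of counts."""
    reader = csv.DictReader(handle, delimiter="\t")
    rows = list(reader)
    # Sort by contact count, highest first.
    rows.sort(key=lambda r: int(float(r[column])), reverse=True)
    # How many locus pairs have each contact count.
    histogram = Counter(int(float(r[column])) for r in rows)
    return rows[:n], histogram


# Toy input with hypothetical values (not taken from any real run).
toy = (
    "chr1\ts1\te1\tchr2\ts2\te2\tcc\n"
    "chr1\t3065000\t3070000\tchr1\t3255000\t3260000\t1\n"
    "chr1\t3075000\t3080000\tchr1\t3665000\t3670000\t5\n"
    "chr1\t3195000\t3200000\tchr1\t3660000\t3665000\t2\n"
    "chr1\t3205000\t3210000\tchr1\t3255000\t3260000\t1\n"
)

top_rows, cc_histogram = top_by_contact_count(io.StringIO(toy), n=2)
for row in top_rows:
    print(row["chr1"], row["s1"], row["e1"], row["s2"], row["e2"], row["cc"])
print(dict(cc_histogram))
```

For a real run, the same function can be given an open file handle instead of the `io.StringIO` object. The histogram gives a quick view of how contact counts are distributed across interactions.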

Miracle-Yao commented 2 years ago

Hi @ay-lab, thank you very much for your reply.

Under these parameters, no contact count in the 7th column of this file exceeds 2. I think this may be related to the sequencing depth. The number of reads I supplied to HiC-Pro was probably too small (~60 million pairs), which resulted in no significant loops even when I increased the bin size to 20 kb. If you don't mind, could you please help me with a few other questions related to FitHiChIP?

  1. I have tried many methods to call significant chromatin loops. When I called peaks from the HiChIP data using FitHiChIP hicpro.sh, I did not find any peaks. This may also be due to the sequencing depth. More importantly, HiChIP data (from NCBI) with sufficient sequencing depth yielded only a few hundred peaks (~500), far fewer than the ~40,000 peaks obtained from ChIP-seq alone. In other words, ChIP-seq peaks combined with HiChIP data give many loops, while peaks called from the HiChIP data itself give only a few hundred loops (~400). What might have caused this?
  2. When calling peaks from HiChIP data with FitHiChIP hicpro.sh, what effect does the parameter L (ReadLength) have on peak calling?
  3. We only chose the 'peak to all' setting. Why does the output directory contain a FitHiChIP_ALL2ALL_b20000_L20000_U2000000 directory?

Thanks again.

ay-lab commented 2 years ago

Hi @Datacond

1) The number of ChIP-seq peaks is much higher than the number of HiChIP peaks, and FitHiChIP returns the interactions that involve a peak in at least one end (peak-to-all mode), so using ChIP-seq peaks generates many more HiChIP loops. We advise using ChIP-seq peaks if they are available.
2) -L is the read length used in the sequencing data. The valid pairs file contains the coordinates of either end, so to estimate the span of the reads at individual ends, we need the read length. For example, a paired-end CIS read with coordinates (chr1, X, Y) means that there are two reads: one read spans X +/- L (depending on the strand; L = read length) while the other read spans Y +/- L. These two reads are used to estimate 1D peaks.
3) The ALL2ALL directory stores, for reference, the complete set of interactions (which may or may not involve peaks) subject to the distance thresholds.
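The read-span idea in point 2 can be sketched as below. This is a minimal sketch under stated assumptions: a read on the "+" strand is taken to cover [pos, pos + L] and a read on the "-" strand to cover [pos - L, pos]. The exact coordinate convention FitHiChIP uses is not given in this thread, and the names `read_span` and `pair_to_spans` are hypothetical.

```python
def read_span(pos, strand, read_length):
    """Approximate the 1D interval covered by one read end.

    Assumption: '+' reads extend downstream of pos, '-' reads extend upstream.
    """
    if strand == "+":
        return (pos, pos + read_length)
    if strand == "-":
        # Clamp at 0 so the interval never starts before the chromosome.
        return (max(0, pos - read_length), pos)
    raise ValueError(f"unknown strand: {strand!r}")


def pair_to_spans(chrom, x, strand_x, y, strand_y, read_length):
    """Turn one CIS read pair (chrom, X, Y) into two 1D read intervals."""
    start_x, end_x = read_span(x, strand_x, read_length)
    start_y, end_y = read_span(y, strand_y, read_length)
    return [(chrom, start_x, end_x), (chrom, start_y, end_y)]


# Hypothetical pair on chr1 with a read length of 75.
spans = pair_to_spans("chr1", 3065000, "+", 3255000, "-", 75)
for chrom, start, end in spans:
    print(chrom, start, end)
```

Intervals of this kind could then be passed to a peak caller as single-end reads. A longer or shorter L changes the width of each interval and so the pileup that the 1D peak estimate is based on.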