ay-lab / FitHiChIP

Statistically Significant loops from HiChIP data
MIT License
Number of locus pairs with nonzero contact count is zero - FitHiChIP is quiting !!! #75

Status: Closed (poflawless closed this issue 2 years ago)

poflawless commented 2 years ago

Hello,

I have successfully run the test data and now need to run FitHiChIP on my own dataset.

However, after I provided my ValidPairs and PeakFile, it fails with this error:

FitHiChIP_HiCPro_Pol2.sh : ligne 1130 : 30563 Process stop $RScriptExec ./src/InteractionHicPro.r $InpBinIntervalFile $InpMatrixFile $Interaction_Initial_File

======= Generated interaction file : /result/HiCPro_Matrix_BinSize1000/FitHiChIP.interactions.initial.bed
cat: /result/HiCPro_MatrixBinSize1000/FitHiChIP.interactions.initial.bed:
==>>> Number of locus pairs with nonzero contact count (without any distance thresholding): 0
** Number of locus pairs with nonzero contact count is zero - FitHiChIP is quiting !!!

It seems that the program can't produce the initial.bed file, so I have listed my config parameters below:

Interaction type - 1: peak to peak, 2: peak to non peak, 3: peak to all (default), 4: all to all, 5: everything from 1 to 4.
IntType=3

Size of the bins [default = 5000], in bases, for detecting the interactions.
BINSIZE=1000

Lower distance threshold of interaction between two segments (default = 20000 or 20 Kb)
LowDistThr=1000

Upper distance threshold of interaction between two segments (default = 2000000 or 2 Mb)
UppDistThr=2000000

Applicable only for peak to all output interactions - values: 0 / 1. If 1, uses only peak to peak loops for background modeling - corresponds to FitHiChIP(S). If 0, uses both peak to peak and peak to nonpeak loops for background modeling - corresponds to FitHiChIP(L).
UseP2PBackgrnd=0

Parameter signifying the type of bias vector - values: 1 / 2. 1: coverage bias regression, 2: ICE bias regression.
BiasType=2

Following parameter, if 1, means that merge filtering (corresponding to either FitHiChIP(L+M) or FitHiChIP(S+M)) depending on the background model, would be employed. Otherwise (if 0), no merge filtering is employed. Default: 1
MergeInt=1

FDR (q-value) threshold for loop significance
QVALUE=0.05

Prefix string of all the output files (Default = 'FitHiChIP').
PREFIX=FitHiChIP

Binary variable 1/0: if 1, overwrites any existing output file. Otherwise (0), does not overwrite any output file.
OverWrite=1

Thanks very much for any suggestions or advice.
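
For readers who want to inspect settings like these programmatically, here is a minimal Python sketch. It assumes the configuration is a plain text file of `KEY=value` lines with `#` comments; that layout is an assumption for illustration, and `parse_config` is a hypothetical helper, not part of FitHiChIP. The values are the ones quoted in this post. The last lines compute how many bins the upper distance threshold spans at a given bin size, which hints at how many candidate bin pairs each bin has.

```python
def parse_config(text):
    """Parse KEY=value lines into a dict, ignoring '#' comments and blanks.

    Assumes a simple KEY=value layout; this is an illustrative helper,
    not FitHiChIP's own parser.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# Values copied from the post; the '#' comment lines are an assumed layout.
EXAMPLE = """\
# Size of the bins, in bases
BINSIZE=1000
# Lower and upper distance thresholds
LowDistThr=1000
UppDistThr=2000000
UseP2PBackgrnd=0
"""

settings = parse_config(EXAMPLE)
bin_size = int(settings["BINSIZE"])
upper = int(settings["UppDistThr"])

# Number of bins spanned by the upper distance threshold.
n_bins_span = upper // bin_size
print(settings)
print("bins spanned by UppDistThr:", n_bins_span)
```

With the posted values the threshold spans 2000 bins at 1 Kb, versus 800 bins at 2.5 Kb, so the same reads are spread over many more candidate bin pairs at the finer resolution.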

ay-lab commented 2 years ago

Hi @poflawless Thanks for your query. I see that the output file name is "/result/HiCPro_Matrix_BinSize1000/FitHiChIP.interactions.initial.bed". Is it a valid filename? Are you providing relative paths of the validpairs and peak files? In such a case, please provide the absolute paths of these files and the output directory, and re-run.
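
The path check suggested above can be scripted before launching a run. The following is a minimal Python sketch; `check_input_paths` is a hypothetical helper and the file paths are made up for illustration. It only covers input files, which must already exist, not the output directory.

```python
import os


def check_input_paths(paths):
    """Return {label: [problems]} for inputs that are relative or missing."""
    problems = {}
    for label, path in paths.items():
        issues = []
        if not os.path.isabs(path):
            issues.append("not an absolute path")
        if not os.path.exists(path):
            issues.append("does not exist")
        if issues:
            problems[label] = issues
    return problems


# Hypothetical input locations; replace with your own files.
inputs = {
    "ValidPairs": "/data/hicpro/sample.allValidPairs",
    "PeakFile": "peaks/sample_peaks.narrowPeak",
}
for label, issues in check_input_paths(inputs).items():
    print(f"{label}: {', '.join(issues)}")
```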

poflawless commented 2 years ago

Hello @ay-lab ,

Thanks for your reply. I did use absolute paths; I just removed them from this post to make it easier to read.

The ValidPairs file is the HiC-Pro output, from which I kept the first 8 columns, and the peak file is a narrowPeak result:

NB501040:310:HT7GVBGXC:3:13402:8557:1265 chr1 27 + chr5 9202814 - 114
NB501040:310:HT7GVBGXC:3:13510:16154:11544 chr1 27 + chr7 7793213 + 373
NB501040:345:HT7VKBGXJ:2:13105:8403:2668 chr1 27 + chr10 45455303 - 171
NB501040:310:HT7GVBGXC:2:13102:1590:7256 chr1 28 + chr3 28987071 - 170
NB501040:345:HT7VKBGXJ:3:23508:15600:6071 chr1 41 + chr1 64203046 + 158
NB501040:310:HT7GVBGXC:4:23610:18200:7384 chr1 41 + chr7 35579 + 469

Did I do something wrong?
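
One quick sanity check on a ValidPairs file is to count how many read pairs are on the same chromosome and fall inside the configured distance range. The Python sketch below does this for the six rows quoted in this thread. It assumes the column order shown in those rows (read name, chromosome 1, position 1, strand 1, chromosome 2, position 2, strand 2, fragment size) and assumes that only same-chromosome pairs between LowDistThr and UppDistThr are relevant. `summarize_pairs` is a hypothetical helper, not a FitHiChIP function.

```python
from collections import Counter


def summarize_pairs(lines, low=1000, high=2_000_000):
    """Count trans pairs, cis pairs, and cis pairs within [low, high] bases."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        chrom1, pos1 = fields[1], int(fields[2])
        chrom2, pos2 = fields[4], int(fields[5])
        if chrom1 != chrom2:
            counts["trans"] += 1
            continue
        counts["cis"] += 1
        if low <= abs(pos2 - pos1) <= high:
            counts["cis_in_range"] += 1
    return counts


# The six rows quoted in the thread (first lines of a sorted file).
SAMPLE = """\
NB501040:310:HT7GVBGXC:3:13402:8557:1265 chr1 27 + chr5 9202814 - 114
NB501040:310:HT7GVBGXC:3:13510:16154:11544 chr1 27 + chr7 7793213 + 373
NB501040:345:HT7VKBGXJ:2:13105:8403:2668 chr1 27 + chr10 45455303 - 171
NB501040:310:HT7GVBGXC:2:13102:1590:7256 chr1 28 + chr3 28987071 - 170
NB501040:345:HT7VKBGXJ:3:23508:15600:6071 chr1 41 + chr1 64203046 + 158
NB501040:310:HT7GVBGXC:4:23610:18200:7384 chr1 41 + chr7 35579 + 469
"""

summary = summarize_pairs(SAMPLE.splitlines())
print("trans:", summary["trans"])
print("cis:", summary["cis"])
print("cis within distance range:", summary["cis_in_range"])
```

On these six rows the sketch reports 5 trans pairs, 1 cis pair, and 0 cis pairs within 1 Kb to 2 Mb. Six rows from the top of a sorted file are not representative, so the same count over the whole file is the informative number.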

poflawless commented 2 years ago

Hello @ay-lab, I got results after I changed the bin size from 1000 to 2500 bp. Could you explain a little more about why the bin size is so critical for the results?

What is the standard for choosing the bin size? Do you have any suggestions? For some data I need a bin size smaller than 2.5 kb; what should I do to make that possible?

Thanks for your reply

ay-lab commented 2 years ago

Hi @poflawless The high resolution (1 Kb) bin size only works if your data has a very high sequencing depth (~100M reads or more). Otherwise, you have to increase the bin size to get a sufficient number of contacts between individual pairs of bins. I'd suggest testing resolutions of 2.5 Kb, 5 Kb, and 10 Kb with the different FitHiChIP background models (loose, P2P=0, and stringent, P2P=1) to decide on the resolution and the background model. In general, high resolution and stringent background models work for high sequencing depth.
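
The effect of bin size on contacts per bin pair can be illustrated with a small Python sketch. It bins the same set of contacts at the four resolutions mentioned above and reports, for each, the number of bin pairs with a nonzero count and the mean count per such pair. The contact positions are made-up toy values, and `bin_pair_counts` and `summarize_resolutions` are hypothetical helpers; this is not how FitHiChIP builds its interaction matrix.

```python
from collections import Counter


def bin_pair_counts(pairs, bin_size):
    """Count contacts per (chrom, bin1, bin2) for same-chromosome pairs."""
    counts = Counter()
    for chrom1, pos1, chrom2, pos2 in pairs:
        if chrom1 != chrom2:
            continue  # only cis contacts are binned in this sketch
        b1, b2 = sorted((pos1 // bin_size, pos2 // bin_size))
        counts[(chrom1, b1, b2)] += 1
    return counts


def summarize_resolutions(pairs, bin_sizes=(1000, 2500, 5000, 10000)):
    """Map bin size -> (nonzero bin pairs, mean contacts per nonzero pair)."""
    summary = {}
    for bin_size in bin_sizes:
        counts = bin_pair_counts(pairs, bin_size)
        n_pairs = len(counts)
        mean = sum(counts.values()) / n_pairs if n_pairs else 0.0
        summary[bin_size] = (n_pairs, mean)
    return summary


# Made-up toy contacts between two nearby regions on chr1.
TOY_PAIRS = [
    ("chr1", 10_200, "chr1", 50_100),
    ("chr1", 11_900, "chr1", 52_400),
    ("chr1", 13_300, "chr1", 54_800),
    ("chr1", 14_700, "chr1", 53_600),
]

summary = summarize_resolutions(TOY_PAIRS)
for bin_size, (n_pairs, mean) in summary.items():
    print(f"{bin_size} bp: {n_pairs} nonzero bin pairs, mean count {mean:.1f}")
```

In this toy case the four contacts land in four separate bin pairs at 1 Kb (one contact each), two bin pairs at 2.5 Kb, and a single bin pair at 5 Kb and 10 Kb. At a fixed sequencing depth, smaller bins leave fewer contacts per bin pair.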