yufanzhouonline / HiSIF

HiSIF: Genome-wide chromatin interactions identify characteristic promoter-distal loops
MIT License

Pair_Frags...read error with HiSIF command #2

Open yaojiayingJenny opened 3 years ago

yaojiayingJenny commented 3 years ago

I got an error at the HiSIF step. It seems my input files aren't in the correct format, but I followed the pipeline tutorial instructions strictly. Does anyone know the reason? Thanks.

$ head test/chr1.tmp
1 10000001 1 1 10001638 1
1 10000018 1 1 9996231 1
1 10000020 1 1 9383708 1
1 10000021 1 1 9564208 1
1 100000267 0 1 103050925 0
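
One quick sanity check on input like this, offered only as a sketch: every line in the sample above has six whitespace-separated fields, so a line with a different field count would be suspicious. The function name `check_tmp_fields` is made up for illustration, and the six-field rule is inferred from the sample, not taken from the HiSIF documentation.

```shell
# check_tmp_fields FILE
# Report every line that does not have exactly six whitespace-separated
# fields (the layout seen in the sample; an assumption, not a HiSIF spec).
# Returns non-zero if at least one such line is found.
check_tmp_fields() {
    awk 'NF != 6 {
             printf "%s:%d: expected 6 fields, found %d\n", FILENAME, FNR, NF
             bad = 1
         }
         END { exit bad }' "$1"
}
```

It could be called as `check_tmp_fields test/chr1.tmp`. A clean result only rules out truncated or merged lines; it says nothing about whether the values themselves are what HiSIF expects.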

command: HiSIF -g hg19_bowtie2_index -c hg19.MboI.bed -p 1 29 -w 50 500 5000 -T 1 -s 0.1 -i 2 -m 1 -x 5 -o test/result test/

out.log:
(=:...........Start processing files...........:=)
cuttingSiteTotal == 7127585
<-----Parsed enzyme cutting site map----->
<-----Extended cutting site region----->
<-----Combining Data from Child Processes----->
<-----Reading Vector Sizes for Bootstrapping----->
<-----Found 25 files----->
<-----Reading sum pipe----->
<-----Performing filtration----->
<-----Writing to test__PoisMix.txt----->
<-----Main Process 1 finished writing distrubitions----->
<-----Finished Vector Sizes for Bootstrapping----->
<-----Child Processes Finished----->
<-----Beginning Bootstrapping----->
<-----Using samplesize of 13555475 elements----->
<-----Random Dataset 1----->
Iteration: 1--> LLH: -3.26381
Iteration: 2--> LLH: -2.88868
Iteration: 3--> LLH: -2.69257
Iteration: 4--> LLH: -2.58041
Iteration: 5--> LLH: -2.51447
<-----Random Dataset 2----->
Iteration: 1--> LLH: -3.47266
Iteration: 2--> LLH: -3.7416
Iteration: 3--> LLH: -3.9588
<-----Proximate ligation events extracted from mixture model----->
<-----Starting Frequency Generation----->

yufanzhouonline commented 3 years ago

Hi yaojiayingJenny,

If you run HiSIF on only one chromosome, please also create empty files for the other chromosomes. You can do this with a shell loop like the following:

Linux Shell to make all empty files

for chrno in $(seq 2 23)
do
        touch chr${chrno}.tmp
done
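
A variant of the loop above, as a sketch: it takes the input directory as an argument (the loop as written creates the files in the current directory, while the command in this thread reads from test/) and reports which files it had to create. The function name `make_empty_chr_files` is hypothetical; the default upper bound of 23 is taken from the loop above.

```shell
# make_empty_chr_files DIR [LAST]
# Create DIR/chrN.tmp for N = 1..LAST (default 23) when the file is missing.
# Existing files are left untouched.
make_empty_chr_files() {
    dir=$1
    last=${2:-23}
    for chrno in $(seq 1 "$last"); do
        f="$dir/chr${chrno}.tmp"
        if [ ! -e "$f" ]; then
            touch "$f"
            echo "created empty $f"
        fi
    done
}
```

For example, `make_empty_chr_files test` would fill in any missing chr1.tmp to chr23.tmp under test/.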

Thanks.

yaojiayingJenny commented 3 years ago

Thanks for your reply.

Actually, my input contains 25 files, which were generated from a validPairs file by the "proc" command (HiSIF_V1.00/bin/proc split/ test/ -t). This matches the log line <-----Found 25 files----->.

$ ls test
chr10.tmp chr12.tmp chr14.tmp chr16.tmp chr18.tmp chr1.tmp chr21.tmp chr23.tmp chr25.tmp chr3.tmp chr5.tmp chr7.tmp chr9.tmp
chr11.tmp chr13.tmp chr15.tmp chr17.tmp chr19.tmp chr20.tmp chr22.tmp chr24.tmp chr2.tmp chr4.tmp chr6.tmp chr8.tmp

I also tried your suggestion with 23 files (chr1.tmp to chr23.tmp) and with 24 files (chr1.tmp to chr24.tmp), but neither run succeeded.

Please let me know if there are any other rules I need to follow. If it's convenient for you, could you also upload the test data together with the command used to run it? Thanks a lot.

Here is some other information about my command:

-g hg19_bowtie2_index (my index path)
├── hg19.1.bt2
├── hg19.2.bt2
├── hg19.3.bt2
├── hg19.4.bt2
├── hg19.rev.1.bt2
└── hg19.rev.2.bt2
chromosome information is: chr1 ... chr22 chrX

-c hg19.MboI.bed (from HiSIF_V1.00/resources/ after the "cat" command)
chr1 ... chr22 chrM chrX chrY
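
The two listings above are not identical: the cutting-site file also names chrM and chrY. Whether that matters to HiSIF is not established anywhere in this thread, but the difference is easy to list. The sketch below assumes the index names have already been written to a text file, one per line (for instance with `bowtie2-inspect -n`, if Bowtie 2 is installed); the function name `names_only_in_bed` and the file names in the usage note are hypothetical.

```shell
# names_only_in_bed BED NAMES
# Print the chromosome names that appear in column 1 of BED but not in
# NAMES (a text file with one reference name per line).
names_only_in_bed() {
    bed_names=$(mktemp)
    ref_names=$(mktemp)
    # Keep only the first whitespace-separated field of each line.
    awk '{ print $1 }' "$1" | sort -u > "$bed_names"
    awk '{ print $1 }' "$2" | sort -u > "$ref_names"
    # comm -23 keeps lines unique to the first (sorted) input.
    comm -23 "$bed_names" "$ref_names"
    rm -f "$bed_names" "$ref_names"
}
```

A possible use, assuming the index prefix is hg19_bowtie2_index/hg19: run `bowtie2-inspect -n hg19_bowtie2_index/hg19 > index_names.txt`, then `names_only_in_bed hg19.MboI.bed index_names.txt`. For the listings quoted above, the expected difference would be chrM and chrY.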

yufanzhouonline commented 3 years ago

Hi yaojiayingJenny,

I have uploaded the example data, please find them on:

https://github.com/yufanzhouonline/HiSIF/tree/master/HiSIF_V1.00/example

Please try to run HiSIF with these example data.

Please let me know if it works.

Thanks.

jiayingyao commented 3 years ago

Hi yufanzhouonline,

Thanks for your example data. I tested it and got the same "Pair_Frags read error". However, I noticed that the peakThreshold parameter has a different name in our two commands: mine is "-T", yours is "-t". Is that because we are using different versions?

your command:
bin/HiSIF -g genome/hg19 -c resources/hg19.HindIII.bed -w 36 500 20000 -p 1 29 -t 1 -i 2 hESC

my command (the -m parameter is required in my version):
bin/HiSIF -g genome/hg19 -c resources/hg19.HindIII.bed -w 36 500 20000 -p 1 29 -T 1 -i 2 -m 2 -o output hESC

Program: HiSIF - HiC Significant Interaction Fragments
Version: 1.0
HiSIF [options]

-g <DIR>                 reference genome directory
-p <INT> <INT>           poisson mixture model parameters
-w <INT> <INT> <INT>     readLength, cuttingSiteExtent, fragmentExtent
-T <INT>                 peakThreshold value
-s <0.0-1.0>             percentage of dataset for bootstrapping
-i <INT>                 bootstrapping iterations
-c <FILE>                cutting sites map .bed file
-o <FILE>                outputfile
-x                       limit number of child processes used

Experimental:
-m <1 or 2>              use file i/o to save main memory, 1 saves some, 2 saves more

yufanzhouonline commented 3 years ago

Hi yaojiayingJenny,

Please download the latest version and install HiSIF as described (just run "make" in the HiSIF folder).

Then follow my command, not your command.

Thanks.
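
The rebuild suggested above might look like the following sketch. The repository URL and the HiSIF_V1.00 folder name come from the example-data link earlier in this thread; that the Makefile sits in HiSIF_V1.00 is an assumption. The helper names `run` and `rebuild_hisif` and the `DRY_RUN` variable are made up for illustration. By default the commands are only printed; set `DRY_RUN=0` to execute them.

```shell
# run CMD...: print the command; execute it only when DRY_RUN is set to 0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# rebuild_hisif: fetch the current sources and build them.
# Assumption: "make" is meant to be run inside HiSIF_V1.00.
rebuild_hisif() {
    run git clone https://github.com/yufanzhouonline/HiSIF &&
    run cd HiSIF/HiSIF_V1.00 &&
    run make
}
```

After a successful build, the command to try is the one quoted earlier in this thread, with "-t" and without "-m" or "-o".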