nservant / HiC-Pro

HiC-Pro: An optimized and flexible pipeline for Hi-C data processing

[bowtie_pairing] Error 1 #482

Open linshengnan09 opened 3 years ago

linshengnan09 commented 3 years ago

Hello! I installed HiC-Pro with conda, but the step "Pairing of R1 and R2 tags ..." always fails. Can you tell me how to solve this problem? Thank you!

The log:

Run HiC-Pro 3.1.0

Tue Sep 21 23:35:28 CST 2021
Bowtie2 alignment step1 ...
Logs: logs/sample1/mapping_step1.log
Logs: logs/sample2/mapping_step1.log

Wed Sep 22 19:25:19 CST 2021
Bowtie2 alignment step2 ...
Logs: logs/sample1/mapping_step2.log
Logs: logs/sample2/mapping_step2.log

Thu Sep 23 01:51:50 CST 2021
Combine R1/R2 alignment files ...
Logs: logs/sample1/mapping_combine.log
Logs: logs/sample2/mapping_combine.log

Thu Sep 23 02:58:25 CST 2021
Mapping statistics for R1 and R2 tags ...
Logs: logs/sample1/mapping_stats.log
Logs: logs/sample2/mapping_stats.log

Thu Sep 23 04:23:58 CST 2021
Pairing of R1 and R2 tags ...
Logs: logs/sample1/mergeSAM.log
make: *** [bowtie_pairing] Error 1

The mergeSAM.log:

~/miniconda3/envs/hic-pro_env/bin/python ~/02_software/hic-pro/HiC-Pro-3.1.0/scripts/mergeSAM.py -q 10 -t -v -f bowtie_results/bwt2/sample1/XB_R1_merged_canu_asm.fasta.bwt2merged.bam -r bowtie_results/bwt2/sample1/XB_R2_merged_canu_asm.fasta.bwt2merged.bam -o bowtie_results/bwt2/sample1/XB_merged_canu_asm.fasta.bwt2pairs.bam
[E::idx_find_and_load] Could not retrieve index file for 'bowtie_results/bwt2/sample1/XB_R1_merged_canu_asm.fasta.bwt2merged.bam'
[E::idx_find_and_load] Could not retrieve index file for 'bowtie_results/bwt2/sample1/XB_R2_merged_canu_asm.fasta.bwt2merged.bam'

mergeBAM.py

forward= bowtie_results/bwt2/sample1/XB_R1_merged_canu_asm.fasta.bwt2merged.bam

reverse= bowtie_results/bwt2/sample1/XB_R2_merged_canu_asm.fasta.bwt2merged.bam

output= bowtie_results/bwt2/sample1/XB_merged_canu_asm.fasta.bwt2pairs.bam

min mapq= 10

report_single= False

report_multi= False

verbose= True

Merging forward and reverse tags ...

Forward and reverse reads not paired. Check that BAM files have the same read names and are sorted.

nservant commented 3 years ago

Hi, it seems that your R1 and R2 files are not paired. Could you please show me the first lines of your fastq files so I can double-check the read names? Thanks

linshengnan09 commented 3 years ago

R1:
@A00511:346:HLNVMDSXY:1:1101:1054:1000 1:N:0:CAAGTCTA+GCCTTAAT
ANTCCCGGAAAGTGCTGAGGTTTGGGCCCCTGAGACGAGAGACGTCAGGATAGACTGGGTTAGCCCCGGTTGGTTTTCAATTTATGAATCATCCTTCAAGTTTG
+
F#FFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFF

R2:
@A00511:346:HLNVMDSXY:1:1101:1054:1000 2:N:0:CAAGTCTA+GCCTTAAT
GACTTTGACGTTCGAGCGTGGCATTGGTATGTGGACTGGTGGTATGTTGGTTTGCGGTTTGGTTTGGAAAGATCGATCAAACGCCAGACGGAAGGCATGACTTG
+
FFFFF::FFFFFFFFFF,FFFFFFFFFFFFFFFFF,FF,FFF,FFFFFFFFF:,FFF,FFFFFFFFFF:FFF:FFFFFFFFFFFFFFFFFFF,:FFF:FFF:FF
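For reference, in the header format shown above the read name is the text before the first space, and the trailing field (1:N:0:... or 2:N:0:...) is a comment that carries the mate number. The BAM read names printed later in this thread contain only that first part, so the two records above share one name. A minimal Python sketch of that comparison, where `read_name` is a hypothetical helper and not part of HiC-Pro:

```python
def read_name(header: str) -> str:
    """Return the read name from a FASTQ header line.

    The name is taken as the text between the leading '@' and the
    first whitespace; anything after that is treated as a comment.
    """
    if not header.startswith("@"):
        raise ValueError("not a FASTQ header line")
    parts = header[1:].split(None, 1)
    if not parts:
        raise ValueError("empty FASTQ header")
    return parts[0]


# Headers copied from the two records in the thread.
r1 = "@A00511:346:HLNVMDSXY:1:1101:1054:1000 1:N:0:CAAGTCTA+GCCTTAAT"
r2 = "@A00511:346:HLNVMDSXY:1:1101:1054:1000 2:N:0:CAAGTCTA+GCCTTAAT"

# The comments differ (1:N:0 vs 2:N:0) but the names are identical.
print(read_name(r1) == read_name(r2))
```

Under this reading, the 1:N:0 / 2:N:0 difference alone would not make the names mismatch.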

nservant commented 3 years ago

I'm wondering if the "1:N:0" on one side and the "2:N:0" on the other could explain the issue, but I thought that had been solved in a previous version ... Before checking that, are you 100% sure that you have exactly the same reads in both R1/R2 files? N

linshengnan09 commented 3 years ago

I remember running HiC-Pro on the same data before without any problem. Later, I ran it again and it reported an error, so I installed the latest version, but the same error was reported. There was also no problem with the data when I ran 3d-dna.

nservant commented 3 years ago

Would you mind sharing these two files with me, please (private message)?

bowtie_results/bwt2/sample1/XB_R1_merged_canu_asm.fasta.bwt2merged.bam
bowtie_results/bwt2/sample1/XB_R2_merged_canu_asm.fasta.bwt2merged.bam

Or maybe just a few thousand reads. Thanks

linshengnan09 commented 3 years ago

How can I share these two files with you? The files are too big; shall I split them up and send them by mail?

nservant commented 3 years ago

Yes, or use a file-sharing service such as WeTransfer, for instance.

linshengnan09 commented 3 years ago

OK, could you give me your email address?

linshengnan09 commented 3 years ago

OK, I have sent the file to you via WeTransfer, please check.

nservant commented 3 years ago

got them. I'll come back to you N

nservant commented 3 years ago

Sorry, I cannot open the files. These are truncated SAM files, and I can't convert them to BAM files. Could you send the complete files?

linshengnan09 commented 3 years ago

Sorry, the complete file is 30 GB, so I converted the file to BAM again and sent it to you.

nservant commented 3 years ago

OK, so there is indeed an issue with the order of the reads, which differs between the two files. So either there is something wrong with the fastq files, or something went wrong at a sorting step ...

nservant commented 3 years ago

The error occurs at line 187 in the R1 file:

>>samtools view XB_R2_merged_canu_asm.fasta.bwt2merged.bam.bam | awk '{print $1}' | head -190 | tail
A00511:346:HLNVMDSXY:1:1101:31882:1000
A00511:346:HLNVMDSXY:1:1101:31937:1000
A00511:346:HLNVMDSXY:1:1101:32081:1000
A00511:346:HLNVMDSXY:1:1101:32136:1000
A00511:346:HLNVMDSXY:1:1101:32316:1000
A00511:346:HLNVMDSXY:1:1101:32515:1000
A00511:346:HLNVMDSXY:1:1101:1063:1016
A00511:346:HLNVMDSXY:1:1101:1118:1016
A00511:346:HLNVMDSXY:1:1101:1262:1016
A00511:346:HLNVMDSXY:1:1101:1768:1016
(/data/users/nservant/projects_analysis/kdi_home/conda/hic-pro-3.1.0) 
nservant@u900-bdd-1-203n-6985:/data/tmp/nservant/hic-pro$
>>samtools view XB_R1_merged_canu_asm.fasta.bwt2merged.bam.bam | awk '{print $1}' | head -190 | tail
A00511:346:HLNVMDSXY:1:1101:31882:1000
A00511:346:HLNVMDSXY:1:1101:31937:1000
A00511:346:HLNVMDSXY:1:1101:32081:1000
A00511:346:HLNVMDSXY:1:1101:32136:1000
A00511:346:HLNVMDSXY:1:1101:32316:1000
A00511:346:HLNVMDSXY:1:1101:32515:1000
A00511:346:HLNVMDSXY:1:1101:32786:1000
A00511:346:HLNVMDSXY:1:1101:1118:1016
A00511:346:HLNVMDSXY:1:1101:1985:1016
A00511:346:HLNVMDSXY:1:1101:2040:1016

You can see that the read order starts to diverge ...
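The same comparison can be sketched in Python for anyone who wants to locate the divergence without reading the lists by eye. This is only an illustration: `first_mismatch` is a hypothetical helper, and the two lists hold the ten names printed above (lines 181 to 190 of each file) with the common prefix A00511:346:HLNVMDSXY:1:1101: omitted for brevity.

```python
from itertools import zip_longest


def first_mismatch(names_a, names_b):
    """Return the 1-based position of the first differing name, or None.

    zip_longest pads the shorter input with None, so a length
    difference also counts as a mismatch.
    """
    for pos, (a, b) in enumerate(zip_longest(names_a, names_b), start=1):
        if a != b:
            return pos
    return None


# Names from the thread, common prefix omitted.
r2_tail = ["31882:1000", "31937:1000", "32081:1000", "32136:1000", "32316:1000",
           "32515:1000", "1063:1016", "1118:1016", "1262:1016", "1768:1016"]
r1_tail = ["31882:1000", "31937:1000", "32081:1000", "32136:1000", "32316:1000",
           "32515:1000", "32786:1000", "1118:1016", "1985:1016", "2040:1016"]

offset = 180  # `head -190 | tail` shows lines 181-190
pos = first_mismatch(r1_tail, r2_tail)
print(offset + pos)  # prints 187
```

For full files, the name lists could be streamed from something like `samtools view file.bam | cut -f1` rather than typed in.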

linshengnan09 commented 3 years ago

It must be the sorting step: the BAM files could not be sorted. But how can I solve this problem? I changed the samtools version from 1.12 to 1.11, but that still did not solve it.

nservant commented 3 years ago

I do not think it is linked to the samtools version, but rather to the RAM you are using to sort the data (or the disk space, if the sort command is swapping). Do you have any messages related to sort in the log file from the step that merges the two mapping steps?

linshengnan09 commented 3 years ago

I checked the mapping_combine.log:

~/01_bin/samtools merge -@ 36 -n -f bowtie_results/bwt2/sample1/XB_R2_merged_canu_asm.fasta.bwt2merged.bam bowtie_results/bwt2_global/sample1/XB_R2_merged_canu_asm.fasta.bwt2glob.bam bowtie_results/bwt2_local/sample1/XB_R2_merged_canu_asm.fasta.bwt2glob.unmap_bwt2loc.bam
~/01_bin/samtools merge -@ 36 -n -f bowtie_results/bwt2/sample1/XB_R1_merged_canu_asm.fasta.bwt2merged.bam bowtie_results/bwt2_global/sample1/XB_R1_merged_canu_asm.fasta.bwt2glob.bam bowtie_results/bwt2_local/sample1/XB_R1_merged_canu_asm.fasta.bwt2glob.unmap_bwt2loc.bam
~/01_bin/samtools sort -@ 36 -m 21M -n -T tmp/XB_R1_merged_canu_asm.fasta -o bowtie_results/bwt2/sample1/XB_R1_merged_canu_asm.fasta.bwt2merged.sorted.bam bowtie_results/bwt2/sample1/XB_R1_merged_canu_asm.fasta.bwt2merged.bam
~/01_bin/samtools sort -@ 36 -m 21M -n -T tmp/XB_R2_merged_canu_asm.fasta -o bowtie_results/bwt2/sample1/XB_R2_merged_canu_asm.fasta.bwt2merged.sorted.bam bowtie_results/bwt2/sample1/XB_R2_merged_canu_asm.fasta.bwt2merged.bam
[bam_sort_core] merging from 6120 files and 36 in-memory blocks...
[E::hts_open_format] Failed to open file "tmp/XB_R1_merged_canu_asm.fasta.1018.bam" : Too many open files
samtools sort: fail to open "tmp/XB_R1_merged_canu_asm.fasta.1018.bam": Too many open files
[bam_sort_core] merging from 6120 files and 36 in-memory blocks...
[E::hts_open_format] Failed to open file "tmp/XB_R2_merged_canu_asm.fasta.1018.bam" : Too many open files
samtools sort: fail to open "tmp/XB_R2_merged_canu_asm.fasta.1018.bam": Too many open files

nservant commented 3 years ago

Yes, well done! So the samtools sort failed. If you look at your command, -@ 36 -m 21M means that it only has 21 MB of memory (per thread) to sort the file, which is too little. So it has to write to disk a lot, and it generates too many temporary files.

This memory parameter is SORT_RAM in your configuration file. By default, it is set to 1000M, so I guess you changed it to 21. Please try increasing this RAM parameter.
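To spell out why the number of temporary files matters: the log reports a merge from 6120 temporary files, and the failure at temporary file number 1018 would be consistent with a per-process limit of 1024 open files. That limit is an assumption here (it is a common default, but the log does not state it), as is the idea that the final merge opens all temporary files at once. A minimal Python sketch of the check, with `merge_fits_limit` and `reserved` as hypothetical names:

```python
def merge_fits_limit(n_tmp_files: int, open_file_limit: int, reserved: int = 8) -> bool:
    """Return True if a merge that opens every temporary file at once
    stays within the open-file limit.

    `reserved` is a rough allowance for descriptors the process already
    holds (stdin, stdout, stderr, input and output files); the value 8
    is an illustrative guess, not a measured number.
    """
    return n_tmp_files + reserved <= open_file_limit


ASSUMED_LIMIT = 1024  # assumption: a common default soft limit

# Temporary file counts taken from the "merging from N files" log lines.
for n_files in (6120, 4752):
    print(n_files, merge_fits_limit(n_files, ASSUMED_LIMIT))
```

On most Linux shells the actual limit can be read with `ulimit -n`. Giving the sort more memory per thread reduces the number of temporary files, which is the route suggested above.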

linshengnan09 commented 3 years ago

Is the parameter in the bowtie_combine.sh script? bowtie_combine.sh:

# Set a default for legacy config files that do not have SORT_RAM set
if [[ "${SORT_RAM}" == "" ]]; then
   SORT_RAM="768"
fi

nservant commented 3 years ago

Yes, but you have it in the config-hicpro.txt file. You do not need to modify the code.

linshengnan09 commented 3 years ago

OK, I found that this parameter is missing from my config-hicpro.txt file.

nservant commented 3 years ago

Ah OK, maybe that's an old config file. But that's strange because, as you pointed out, in that case it should be set to 768 ... so I do not really understand why you have 21 in your log.

linshengnan09 commented 3 years ago

I set SORT_RAM to 1000M in config-hicpro.txt, but it didn't work. The mapping_combine.log:

~/01_bin/samtools merge -@ 36 -n -f bowtie_results/bwt2/sample1/XB_R1_merged_canu_asm.fasta.bwt2merged.bam bowtie_results/bwt2_global/sample1/XB_R1_merged_canu_asm.fasta.bwt2glob.bam bowtie_results/bwt2_local/sample1/XB_R1_merged_canu_asm.fasta.bwt2glob.unmap_bwt2loc.bam
~/01_bin/samtools merge -@ 36 -n -f bowtie_results/bwt2/sample1/XB_R2_merged_canu_asm.fasta.bwt2merged.bam bowtie_results/bwt2_global/sample1/XB_R2_merged_canu_asm.fasta.bwt2glob.bam bowtie_results/bwt2_local/sample1/XB_R2_merged_canu_asm.fasta.bwt2glob.unmap_bwt2loc.bam
~/01_bin/samtools sort -@ 36 -m 27M -n -T tmp/XB_R1_merged_canu_asm.fasta -o bowtie_results/bwt2/sample1/XB_R1_merged_canu_asm.fasta.bwt2merged.sorted.bam bowtie_results/bwt2/sample1/XB_R1_merged_canu_asm.fasta.bwt2merged.bam
~/01_bin/samtools sort -@ 36 -m 27M -n -T tmp/XB_R2_merged_canu_asm.fasta -o bowtie_results/bwt2/sample1/XB_R2_merged_canu_asm.fasta.bwt2merged.sorted.bam bowtie_results/bwt2/sample1/XB_R2_merged_canu_asm.fasta.bwt2merged.bam
[bam_sort_core] merging from 4752 files and 36 in-memory blocks...
[E::hts_open_format] Failed to open file "tmp/XB_R1_merged_canu_asm.fasta.1018.bam" : Too many open files
samtools sort: fail to open "tmp/XB_R1_merged_canu_asm.fasta.1018.bam": Too many open files
[bam_sort_core] merging from 4752 files and 36 in-memory blocks...
[E::hts_open_format] Failed to open file "tmp/XB_R2_merged_canu_asm.fasta.1018.bam" : Too many open files
samtools sort: fail to open "tmp/XB_R2_merged_canu_asm.fasta.1018.bam": Too many open files

nservant commented 3 years ago

Hi, there is something wrong: if you look at your new logs, you are still using -m 27M. N

nservant commented 3 years ago

Sorry, I know what's going on! Actually, the SORT_RAM parameter is divided by the number of CPUs. For instance, using 1000M with 4 CPUs means that samtools sort is run with 250M of RAM per CPU. So it makes sense ... you have 1000M / 36 CPUs = 27M of RAM.

I would suggest decreasing the number of CPUs, to 8 for instance ... this is enough! Or increase the SORT_RAM parameter further. Best
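The arithmetic can be checked with a short Python sketch. It assumes the division rounds down, which matches the -m values in both logs; `per_thread_sort_mem` is a hypothetical name, not a HiC-Pro function. Note that 768 divided by 36 gives 21, which would account for the -m 21M in the first log, when SORT_RAM was missing from the config file and the 768 fallback in bowtie_combine.sh applied.

```python
def per_thread_sort_mem(sort_ram_mb: int, n_cpu: int) -> int:
    """Memory in MB handed to each samtools sort thread.

    Assumes SORT_RAM is split evenly across CPUs with integer
    (floor) division, as the logged -m values suggest.
    """
    if n_cpu < 1:
        raise ValueError("n_cpu must be at least 1")
    return sort_ram_mb // n_cpu


# (SORT_RAM in MB, number of CPUs) pairs discussed in the thread.
for sort_ram, n_cpu in [(768, 36), (1000, 36), (1000, 4), (1000, 8)]:
    print(f"SORT_RAM={sort_ram}M, {n_cpu} CPUs -> -m {per_thread_sort_mem(sort_ram, n_cpu)}M")
```

With 8 CPUs and SORT_RAM at 1000M, each thread would get 125M, roughly five times more than in the failing runs.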

seasky002002 commented 2 years ago

Hi, I also have the same problem, but the mapping_combine.log only has two lines:

$ more result/logs/rep1/mapping_combine.log
/usr/local/anaconda/bin/samtools merge -@ 2 -n -f bowtie_results/bwt2/rep1/SRR4015027_pass_2_hg19.bwt2merged.bam bowtie_results/bwt2_global/rep1/SRR4015027_pass_2_hg19.bwt2glob.bam bowtie_results/bwt2_local/rep1/SRR4015027_pass_2_hg19.bwt2glob.unmap_bwt2loc.bam
/usr/local/anaconda/bin/samtools merge -@ 2 -n -f bowtie_results/bwt2/rep1/SRR4015027_pass_1_hg19.bwt2merged.bam bowtie_results/bwt2_global/rep1/SRR4015027_pass_1_hg19.bwt2glob.bam bowtie_results/bwt2_local/rep1/SRR4015027_pass_1_hg19.bwt2glob.unmap_bwt2loc.bam

Would you please help me to debug?

linshengnan09 commented 2 years ago

Hello, I have received your email. Wishing you a happy life and smooth work!

seasky002002 commented 2 years ago

And here is the main log:

Thu Jan 27 20:11:41 CST 2022
Combine R1/R2 alignment files ...
Logs: logs/rep1/mapping_combine.log
make: *** [/usr/local/bin/HiC-Pro_2.11.4/bin/../scripts//Makefile:115: bowtie_combine] Error 129
Hangup
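One note on reading this log: by shell convention, an exit status above 128 usually means the command was terminated by a signal, with the signal number equal to the status minus 128. Status 129 therefore corresponds to signal 1 (SIGHUP), which agrees with the "Hangup" text, and would make this a different failure from the sort problem earlier in the thread. A minimal Python sketch of the decoding, where `describe_exit_status` is a hypothetical helper and the signal lookup assumes a Unix-like system:

```python
import signal


def describe_exit_status(status: int) -> str:
    """Describe a shell exit status, decoding 128+N as signal N."""
    if status > 128:
        try:
            name = signal.Signals(status - 128).name
        except ValueError:
            # No signal with that number on this platform.
            return f"exit status {status} (no matching signal)"
        return f"terminated by {name}"
    return f"exit status {status}"


print(describe_exit_status(129))
print(describe_exit_status(1))
```

A hangup is often sent when the terminal or SSH session that started the job closes, though nothing in the thread confirms that this is what happened here.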