nservant / HiC-Pro

HiC-Pro: An optimized and flexible pipeline for Hi-C data processing

Pairing of R1 and R2 tags #427

Closed by dsrini26 3 years ago

dsrini26 commented 3 years ago

Hi,

I am having an issue with the bowtie pairing of R1 and R2 tags. The run quits with this error: "Pairing of R1 and R2 tags ... Logs: logs/F7/mergeSAM.log make: *** [/home//HiC-Pro_2.11.1/bin/../scripts//Makefile:144: bowtie_pairing] Error 1"

I looked in the mergeSAM.log file, and it shows this:

/usr/bin/python /home/HiC-Pro_2.11.1/scripts/mergeSAM.py -q 30 -t -v -f bowtie_results/bwt2/F7/F7_S117_L001_R1_001_mm10genome.bwt2merged.bam -r bowtie_results/bwt2/F7/F7_S117_L001_R2_001_mm10genome.bwt2merged.bam -o bowtie_results/bwt2/F7/F7_S117_L001_mm10genome.bwt2pairs.bam
File "/home/HiC-Pro_2.11.1/scripts/mergeSAM.py", line 26
print "Usage : python mergeSAM.py"
^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print("Usage : python mergeSAM.py")?

I'm not sure what the problem is. I have used HiC-Pro numerous times in the past and have not had this issue before. Any advice?

Thank you for your help!

mdozmorov commented 3 years ago

There are several closed issues on this, like #308. I tried runs with fewer processors and more memory, and hit the same issue. I didn't experience it with v2.11; I am now using 3.0.

The log mapping_combine.log indicates there's an issue with samtools sort, like:

/home/user/miniconda3/envs/HiC-Pro_v3.0.0/bin/samtools sort -@ 8 -m 96M -n -T tmp/SAMPLE1_R1_hg38 -o bowtie_results/bwt2/SAMPLE1/SAMPLE1_R1_hg38.bwt2merged.sorted.bam bowtie_results/bwt2/SAMPLE1/SAMPLE1_R1_hg38.bwt2merged.bam
...
[bam_sort_core] merging from 2280 files and 8 in-memory blocks...
[E::hts_open_format] Failed to open file tmp/SAMPLE1_R1_hg38.1020.bam

Consequently, mergeSAM.log reports:

/home/user/miniconda3/envs/HiC-Pro_v3.0.0/bin/python /home/user/.local/HiC-Pro_3.0.0/scripts/mergeSAM.py -q 0 -t -v -f bowtie_results/bwt2/SAMPLE1/SAMPLE1_R1_hg38.bwt2merged.bam -r bowtie_results/bwt2/SAMPLE1/SAMPLE1_R2_hg38.bwt2merged.bam -o bowtie_results/bwt2/SAMPLE1/SAMPLE1_hg38.bwt2pairs.bam
## mergeBAM.py
## forward= bowtie_results/bwt2/SAMPLE1/SAMPLE1_R1_hg38.bwt2merged.bam
## reverse= bowtie_results/bwt2/SAMPLE1/SAMPLE1_R2_hg38.bwt2merged.bam
## output= bowtie_results/bwt2/SAMPLE1/SAMPLE1_hg38.bwt2pairs.bam
## min mapq= 0
## report_single= False
## report_multi= False
## verbose= True
## Merging forward and reverse tags ...
Forward and reverse reads not paired. Check that BAM files have the same read names and are sorted.

Both logs provide commands. I'm trying to run them individually, sort first, then merge. That, hopefully, will complete the first step. @dsrini26, try this approach as well.
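The "not paired" message above means the forward and reverse files did not yield the same read names in the same order. Below is a minimal sketch of that kind of lockstep check over two lists of read names. It is not the code from mergeSAM.py: `strip_mate_suffix` and `first_mismatch` are hypothetical helpers, and the handling of `/1` and `/2` suffixes is an assumption about how mates might be named.

```python
from itertools import zip_longest


def strip_mate_suffix(name):
    """Drop a trailing /1 or /2 so both mates compare under one name."""
    if name.endswith(("/1", "/2")):
        return name[:-2]
    return name


def first_mismatch(forward_names, reverse_names):
    """Walk both name lists in lockstep.

    Returns (index, forward_name, reverse_name) for the first position
    where the names differ or one list has run out, or None if every
    position pairs up.
    """
    pairs = zip_longest(forward_names, reverse_names)
    for index, (fwd, rev) in enumerate(pairs):
        if fwd is None or rev is None:
            # One file has more reads than the other.
            return (index, fwd, rev)
        if strip_mate_suffix(fwd) != strip_mate_suffix(rev):
            # Same position, different read: the files are out of step.
            return (index, fwd, rev)
    return None
```

The name lists could be produced with something like `samtools view FILE.bam | cut -f1` on each file. If the sort step failed partway, as in the log above, the two files will generally not line up.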

mdozmorov commented 3 years ago

The following steps helped to complete Step 1.

FASTQFILE=inputfiles_PDX.txt; export FASTQFILE

# Sort BAM files
/home/user/miniconda3/envs/HiC-Pro_v3.0.0/bin/samtools sort -@ 8 -n -T tmp/SAMPLE_R1_hg38 -o bowtie_results/bwt2/SAMPLE/SAMPLE_R1_hg38.bwt2merged.sorted.bam bowtie_results/bwt2/SAMPLE/SAMPLE_R1_hg38.bwt2merged.bam #  -m 96M
/home/user/miniconda3/envs/HiC-Pro_v3.0.0/bin/samtools sort -@ 8 -n -T tmp/SAMPLE_R2_hg38 -o bowtie_results/bwt2/SAMPLE/SAMPLE_R2_hg38.bwt2merged.sorted.bam bowtie_results/bwt2/SAMPLE/SAMPLE_R2_hg38.bwt2merged.bam #  -m 96M

# Merge them
/home/user/miniconda3/envs/HiC-Pro_v3.0.0/bin/python /home/user/.local/HiC-Pro_3.0.0/scripts/mergeSAM.py -q 0 -t -v -f bowtie_results/bwt2/SAMPLE/SAMPLE_R1_hg38.bwt2merged.sorted.bam -r bowtie_results/bwt2/SAMPLE/SAMPLE_R2_hg38.bwt2merged.sorted.bam -o bowtie_results/bwt2/SAMPLE/SAMPLE_hg38.bwt2pairs.bam

# Create results subfolders
mkdir -p hic_results/data/SAMPLE
mkdir -p hic_results/matrix/SAMPLE
mkdir -p hic_results/pic/SAMPLE
mkdir -p hic_results/stats/SAMPLE

# Convert BAM
/home/user/miniconda3/envs/HiC-Pro_v3.0.0/bin/python /home/user/.local/HiC-Pro_3.0.0/scripts/mapped_2hic_fragments.py -v -S -t 100 -m 100000 -s 100 -l 600 -a -f /home/user/output/hg38_GATC_GANTC.bed -r bowtie_results/bwt2/SAMPLE/SAMPLE_hg38.bwt2pairs.bam -o hic_results/data/SAMPLE

Then, Step 2 also ran successfully.
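One possible explanation for why dropping `-m 96M` from the sort commands helped (an assumption, not something confirmed in this thread): with a small per-thread memory cap, samtools sort spills many temporary files and opens them together for the final merge. The earlier log shows a merge of 2280 files failing at file 1020, which is consistent with a per-process open-file limit of 1024, a common default. The sketch below compares a temporary-file count against the current soft limit; `merge_fits` and the `reserved` margin of 16 descriptors are hypothetical, and the `resource` module is Unix-only.

```python
import resource


def open_file_soft_limit():
    """Return the soft limit on open file descriptors for this process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


def merge_fits(n_temp_files, soft_limit, reserved=16):
    """True if n_temp_files plus a small reserved margin fit under soft_limit.

    `reserved` is an illustrative allowance for stdin/stdout/stderr,
    the output file, and similar descriptors.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return n_temp_files + reserved <= soft_limit


if __name__ == "__main__":
    limit = open_file_soft_limit()
    print("soft limit:", limit, "| 2280 temp files fit:", merge_fits(2280, limit))
```

The same limit is shown by `ulimit -n` in the shell. If it turns out to be the cause, either raising the limit or giving the sort more memory per thread (so that fewer temporary files are written) should avoid the failed merge.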

nservant commented 3 years ago

Hi all,

Regarding the first error reported by @dsrini26, this is just a Python version error. I guess he is using python3 with HiC-Pro 2.X.X, which only supports python2. For python3, please move to HiC-Pro 3.X.X.

Best
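The rule in this reply can be checked before launching a run. The sketch below is only an illustration of that rule as stated here (HiC-Pro 2.X.X with python2, HiC-Pro 3.X.X with python3); `required_python_major` and `is_compatible` are hypothetical helpers, not part of HiC-Pro, and the exact supported versions should be taken from the HiC-Pro documentation.

```python
import sys


def required_python_major(hicpro_version):
    """Map a HiC-Pro version string such as '2.11.1' to a Python major version.

    Simplified from the comment above: 2.X.X needs python2, 3.X.X needs python3.
    """
    hicpro_major = int(hicpro_version.split(".")[0])
    return 2 if hicpro_major < 3 else 3


def is_compatible(hicpro_version, python_major=sys.version_info[0]):
    """True if the given interpreter major version matches what HiC-Pro expects."""
    return required_python_major(hicpro_version) == python_major
```

For the first report in this thread, `is_compatible("2.11.1", 3)` is `False`, which matches the `SyntaxError` raised on the python2-style `print` statement.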