Closed sanchari24 closed 4 years ago
Could you check in the logs if you have more errors please ?
Hi, so I have these files in the folder /hic_results_28Jan/logs. Only the mergeSAM.log has the error. Rest other files do not report any error.
mergeSAM.log mapping_combine.log mapping_stats.log mapping_step1.log mapping_step2.log trim_CHiC_clf_Root_S8_R1_paired_TAIR10.fasta.bwt2glob.unmap_readsTrimming.log trim_CHiC_clf_Root_S8_R1_paired_bowtie2.log trim_CHiC_clf_Root_S8_R2_paired_TAIR10.fasta.bwt2glob.unmap_bowtie2.log trim_CHiC_clf_Root_S8_R2_paired_bowtie2.log trim_CHiC_clf_Root_S8_R2_paired_TAIR10.fasta.bwt2glob.unmap_bowtie2.log trim_CHiC_clf_Root_S8_R2_paired_TAIR10.fasta.bwt2glob.unmap_readsTrimming.log
The stderr file in my slurm folder has this
Wed 29 Jan 2020 05:18:24 PM CET Pairing of R1 and R2 tags ... Logs: logs/CHiC_clf_Root_S8/mergeSAM.log make: *** [/opt/share/Debian-Buster/HiC-Pro/HiC-Pro_2.11.1/bin/../scripts//Makefile:144: bowtie_pairing] Error 1
Any particular file I need to look into ?
mergeSAM.log please
/usr/bin/python /opt/share/Debian-Buster/HiC-Pro/HiC-Pro_2.11.1/scripts/mergeSAM.py -q 20 -t -v -f Bowtie2/bwt2/CHiC_clf_Root_S8/trim_CHiC_clf_Root_S8_R1_paired_TAIR10.fasta.bwt2merged.bam -r Bowtie2/bwt2/CHiC_clf_Root_S8/trim_CHiC_clf_Root_S8_R2_paired_TAIR10.fasta.bwt2merged.bam -o Bowtie2/bwt2/CHiC_clf_Root_S8/trim_CHiC_clf_Root_S8_TAIR10.fasta.bwt2pairs.bam
Forward and reverse reads not paired. Check that BAM files have the same read names and are sorted.
Forward and reverse reads not paired. Check that BAM files have the same read names and are sorted.
Could you check your read name please ? in theory R1 reads and R2 reads should have the same names ...
Hello,
I checked the first 3000000 lines in the "...fasta.bwt2merged.bam" files in the bwt2 folder and exported them as text files using
samtools view trim_CHiC_clf_Root_S8_R2_paired_TAIR10.fasta.bwt2merged.bam|head -3000000
I made a diff command on R1 vs R2 and R2 vs R1 I get the following differences. I see that sometimes the difference is due to ">" signs and the numbers. Any idea how to resolve this ?
$diff R1.txt R2.txt
2255646d2255645
< NB501040:225:HV3KFBGXC:1:11311:14928:7992 0 2 13012134 42 76M 0 0 GATTAGAATCTTACAATTTATGTGGACCAAGCTTTTACTTTTTCTCTCGAAGAATATTATTTTGTTCAAGAATCAT AAAAAEEEEEEEEEEEEEEEEEEEEAEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:76 YT:Z:UU RG:Z:BMG
2875065d2875063
< NB501040:225:HV3KFBGXC:1:12109:8874:16587 4 0 0 0 0 GGAACCAGCAATCATTGTTCAAAAATCGCAGATAATGGAAGTAATGGAGATATAGCATGTGATGGGTATCACAAAT AA/AA/AEAEE/EEAE6EEEEEEEE/EEEAAEEEEEEEE/AEEEEE<EEEE6EEEEEEE<AEAEEEEEEEE//EE/ YT:Z:UU RG:Z:BML
3000000a2999999,3000000
> NB501040:225:HV3KFBGXC:1:12111:6699:7303 0 3 1632615 42 76M * 0 0 ATCGCAGCGGTTCGACGCGGAATGCAGCTCTTCAGACAGGTAAATTGGAAAATTCCGACAAACAGAACTTTTCTAA AAAAAEEEEEEEEE6EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEE AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:76 YT:Z:UU RG:Z:BMG
> NB501040:225:HV3KFBGXC:1:12111:6699:7722 16 5 25864489 42 76M * 0 0 GAGTCTACATTGTCTGATTTAGGAAACAAATGAGTTACAAGAGAGACTACAGAGAAATTACCAATCAGGAGTAGCT /EAEEAEEEEEE/EEEEEEEEAEEAE/<EEEE</EEEEEAAEEAEEEEEE/EEE/EEEAEEE/6EEEAEEEAAAAA AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:76 YT:Z:UU RG:Z:BMG
$diff R2.txt R1.txt
2255645a2255646
> NB501040:225:HV3KFBGXC:1:11311:14928:7992 0 2 13012134 42 76M * 0 0 GATTAGAATCTTACAATTTATGTGGACCAAGCTTTTACTTTTTCTCTCGAAGAATATTATTTTGTTCAAGAATCAT AAAAAEEEEEEEEEEEEEEEEEEEEAEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:76 YT:Z:UU RG:Z:BMG 2875063a2875065
> NB501040:225:HV3KFBGXC:1:12109:8874:16587 4 0 0 0 0 GGAACCAGCAATCATTGTTCAAAAATCGCAGATAATGGAAGTAATGGAGATATAGCATGTGATGGGTATCACAAAT AA/AA/AEAEE/EEAE6EEEEEEEE/EEEAAEEEEEEEE/AEEEEE<EEEE6EEEEEEE<AEAEEEEEEEE//EE/ YT:Z:UU RG:Z:BML
2999999,3000000d3000000
< NB501040:225:HV3KFBGXC:1:12111:6699:7303 0 3 1632615 42 76M 0 0 ATCGCAGCGGTTCGACGCGGAATGCAGCTCTTCAGACAGGTAAATTGGAAAATTCCGACAAACAGAACTTTTCTAA AAAAAEEEEEEEEE6EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEE AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:76 YT:Z:UU RG:Z:BMG
< NB501040:225:HV3KFBGXC:1:12111:6699:7722 16 5 25864489 42 76M * 0 0 GAGTCTACATTGTCTGATTTAGGAAACAAATGAGTTACAAGAGAGACTACAGAGAAATTACCAATCAGGAGTAGCT /EAEEAEEEEEE/EEEEEEEEAEEAE/<EEEE</EEEEEAAEEAEEEEEE/EEE/EEEAEEE/6EEEAEEEAAAAA AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:76 YT:Z:UU RG:Z:BMG
This is the first time I see that. So far, the only idea would be create new files that fix this issue. For instance, using a simple awk command line, you should be able to remove the "</>"
Hi, so these ">" are not there in the fastq files, but appearing after we get the bam files. If I edit the "....._R1_paired_TAIR10.fasta.bwt2merged.bam" files, how do I launch the HiC-Pro steps with the edited files as the starting point?
That's strange. To my knowledge, bowtie2 is not expected to change the read names ... To run HiC-Pro from the R1/R2.bam files, use the same data organization, and use the stepwise mode of HiC-Pro (-s options)
Hello, I was finally able to make the R1 and R2 files comparable and was able to merge them as shown below. However, the final Valid pair file size is 0 kb. I have attached some of the statistics files in case if you can kindly figure out anything. By the way, my data is Capture HiC. I have pasted the HiC-Pro config file for your reference. Thank you for your time.
HiC-Pro Mapping Statistics CHiC_clf_Shoot_S3/trim_CHiC_clf_Shoot_S3_R1_paired_TAIR10.fasta.mapstat total_R1 64764546 mapped_R1 59933519 global_R1 47827791 local_R1 12105728
HiC-Pro Mapping Statistics CHiC_clf_Shoot_S3/trim_CHiC_clf_Shoot_S3_R2_paired_TAIR10.fasta.mapstat total_R2 64764546 mapped_R2 59933766 global_R2 47828593 local_R2 12105173
CHiC_clf_Shoot_S3.mpairstat merge_statfiles.py dir= Bowtie2/bwt2/CHiC_clf_Shoot_S3/ pattern= *.pairstat Merging 1 files Use /ips2/IPS2 (7864)/DDEVE (8095)/CHROCYC (8096)/sanchari/David_HiChIP_polycomb_project/Capture_HiC_polycomb_mutant_shoot_root/CHiC_clf_Shoot_S3_hic_result/Bowtie2/bwt2/CHiC_clf_Shoot_S3/trim_CHiC_clf_Shoot_S3_TAIR10.fasta.bwt2pairs.pairstat as template Total_pairs_processed 64764546 100.0 Unmapped_pairs 4830779 7.459 Low_qual_pairs 10173345 15.708 Unique_paired_alignments 49760173 76.832 Multiple_pairs_alignments 0 0.0 Pairs_with_singleton 249 0.0 Low_qual_singleton 0 0.0 Unique_singleton_alignments 0 0.0 Multiple_singleton_alignments 0 0.0 Reported_pairs 49760173 76.832
CHiC_clf_Shoot_S3 File
merge_statfiles.py dir= CHiC_clf_Shoot_S3_hic_result/data//CHiC_clf_Shoot_S3/ pattern= *.RSstat Merging 1 files Use /ips2/IPS2 (7864)/DDEVE (8095)/CHROCYC (8096)/sanchari/David_HiChIP_polycomb_project/Capture_HiC_polycomb_mutant_shoot_root/CHiC_clf_Shoot_S3_hic_result/CHiC_clf_Shoot_S3_hic_result/data/CHiC_clf_Shoot_S3/trim_CHiC_clf_Shoot_S3_TAIR10.fasta.bwt2pairs.RSstat as template Valid_interaction_pairs 0 Valid_interaction_pairs_FF 0 Valid_interaction_pairs_RR 0 Valid_interaction_pairs_RF 0 Valid_interaction_pairs_FR 0 Dangling_end_pairs 0 Religation_pairs 0 Self_Cycle_pairs 0 Single-end_pairs 0 Filtered_pairs 499519 Dumped_pairs 49260654
Config File
#########################################################################
#########################################################################
TMP_DIR = tmp LOGS_DIR = logs BOWTIE2_OUTPUT_DIR = Bowtie2 MAPC_OUTPUT = CHiC_clf_Shoot_S3_hic_result RAW_DIR = Trimming_v1
#######################################################################
####################################################################### N_CPU = 16 LOGFILE = hicpro.log
JOB_NAME = Capture_HiC_polycomb_mutant_shoot_root_2Feb JOB_MEM = 60gb JOB_WALLTIME = 6:00:00 JOB_QUEUE = batch JOB_MAIL = sanchari.sircar@u-psud.fr
#########################################################################
#########################################################################
PAIR1_EXT = _R1_paired PAIR2_EXT = _R2_paired
#######################################################################
#######################################################################
FORMAT = phred33 MIN_MAPQ = 20
BOWTIE2_IDX_PATH = /K/CHROCYC/sanchari/Arabido_Bowtie2_indexes/ BOWTIE2_GLOBAL_OPTIONS = --very-sensitive -L 30 --score-min L,-0.6,-0.2 --end-to-end --reorder BOWTIE2_LOCAL_OPTIONS = --very-sensitive -L 20 --score-min L,-0.6,-0.2 --end-to-end --reorder
#######################################################################
#######################################################################
REFERENCE_GENOME = TAIR10.fasta GENOME_SIZE = TAIR10.sizes.genome
#######################################################################
#######################################################################
ALLELE_SPECIFIC_SNP =
#######################################################################
#######################################################################
GENOME_FRAGMENT = DpnII_resfrag_TAIR10.bed LIGATION_SITE = GATC MIN_FRAG_SIZE = 50 MAX_FRAG_SIZE = 100000 MIN_INSERT_SIZE = 100 MAX_INSERT_SIZE =
#######################################################################
#######################################################################
MIN_CIS_DIST = GET_ALL_INTERACTION_CLASSES = 1 GET_PROCESS_SAM = 1 RM_SINGLETON = 1 RM_MULTI = 1 RM_DUP = 1
#######################################################################
#######################################################################
BIN_SIZE = 1000 2000 5000 10000 MATRIX_FORMAT = upper
#######################################################################
####################################################################### MAX_ITER = 100 FILTER_LOW_COUNT_PERC = 0.02 FILTER_HIGH_COUNT_PERC = 0 EPS = 0.1
Dear Nicolas,
I am having the same issue described in this post. The error message I get in logs/sample/mergeSAM.log is:
"Forward and reverse reads not paired. Check that BAM files have the same read names and are sorted"
My _R1 and _R2 files have the very same read names, in the same order, however, the R1_mm10.bwt2merged.bam and R2_mm10.bwt2merged.bam have the read name in different orders, because .bam files are not sorted by name.
Would it be possible for you to fix the pipeline so that the merged bam files are sorted in the proper way?
Hi HiC-Pro is expected to sort the BAM in a proper way before merging them. One assumption could be that the sort crashed for any reason ... do you have message in the log files ? N
Yes, here is my mergeSAM.log file:
/hpcnfs/home/ieo5073/miniconda3/envs/HiCpro/bin/python /hpcnfs/home/ieo5073/miniconda3/envs/HiCpro/bin/HiC-Pro_2.11.3/scripts/mergeSAM.py -q 10 -t -v -f bowtie_results/bw t2/WTRing1B/WTRing1bFPm_00_R1_mm10.bwt2merged.bam -r bowtie_results/bwt2/WTRing1B/WTRing1bFPm_00_R2_mm10.bwt2merged.bam -o bowtie_results/bwt2/WTRing1B/WTRing1bFPm_00_mm1 0.bwt2pairs.bam
Forward and reverse reads not paired. Check that BAM files have the same read names and are sorted.
Thank you very much, Federico
Sorry, I've just found an error message in the mapping_combine.log:
samtools sort: couldn't allocate memory for bam_mem
I guess this explain the issue with the sorting. I will try to fix this issue before proceeding any further.
Thanks, Federico
Yes exactly, BAM are expected to be sorted by name at the step before. The point is why HIC-Pro did not exit with an error status ... strange .. Anyway, you have to fixe the memory issue :)
Hi, I had the same problem. Once you fix the problem, how do you rerun hicpro from that point forward instead of starting the whole thing over?
Hi Stephanie, Which problem ? the sort issue ? You can run HiC-Pro in stepwise mode with the -s option (see the manual for details). In this case, the input (-i) has to be updated for each step. As you did before to create the maps, you provided the .validPairs as an input ... Best
Hello, I am running Hic-Pro (version 2.11.1) on a cluster and it crashed at the following step. Kindly help.
`## mergeBAM.py
forward= Bowtie2/bwt2/CHiC_clf_Root_S8/trim_CHiC_clf_Root_S8_R1_paired_TAIR10.fasta.bwt2merged.bam
reverse= Bowtie2/bwt2/CHiC_clf_Root_S8/trim_CHiC_clf_Root_S8_R2_paired_TAIR10.fasta.bwt2merged.bam
output= Bowtie2/bwt2/CHiC_clf_Root_S8/trim_CHiC_clf_Root_S8_TAIR10.fasta.bwt2pairs.bam
min mapq= 20
report_single= False
report_multi= False
verbose= True
Merging forward and reverse tags ...
1000000
2000000
Forward and reverse reads not paired. Check that BAM files have the same read names and are sorted. ` I checked the trimmed fastq files, they have the same number of reads
head trim_CHiC_clf_Root_S8_R1_paired.fastq
@NB501040:225:HV3KFBGXC:1:11101:10000:6066 1:N:0:CATGGC GGCTTCATGGGCCATTATTGGCCTTTTCTTCTTTTTTTTTTTTTTTTAATTTCCTTGTTTATATAAATTTTTGATA + AAAAAEEEEEEAEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEAEEEEAEE///A/E/E/E////A///AA/E/A/ @NB501040:225:HV3KFBGXC:1:11101:10000:7825 1:N:0:CATGGC TCTTTGACGGTTTATATATAATACACATAAAATGTAAATCTCCCTAACAAAAATGAGTAATGATACATATATACAC + AAAAAEAEEE/EA/EAE6EEEEEEEEEEEAEEEEEEEEEEEEEAEEEEEEEAEEEAEE6AAEAEEEEEE/AEEAAA @NB501040:225:HV3KFBGXC:1:11101:10001:10268 1:N:0:CATGGC GGAAATAATACTTGTACTACAAAAATCAGAAATCTTAGCTGTTTTTGTACTTGATTGTTGGACTCACAGGTGGCGT
head trim_CHiC_clf_Root_S8_R1_paired.fastq
@NB501040:225:HV3KFBGXC:1:11101:10000:6066 1:N:0:CATGGC GGCTTCATGGGCCATTATTGGCCTTTTCTTCTTTTTTTTTTTTTTTTAATTTCCTTGTTTATATAAATTTTTGATA + AAAAAEEEEEEAEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEAEEEEAEE///A/E/E/E////A///AA/E/A/ @NB501040:225:HV3KFBGXC:1:11101:10000:7825 1:N:0:CATGGC TCTTTGACGGTTTATATATAATACACATAAAATGTAAATCTCCCTAACAAAAATGAGTAATGATACATATATACAC + AAAAAEAEEE/EA/EAE6EEEEEEEEEEEAEEEEEEEEEEEEEAEEEEEEEAEEEAEE6AAEAEEEEEE/AEEAAA @NB501040:225:HV3KFBGXC:1:11101:10001:10268 1:N:0:CATGGC GGAAATAATACTTGTACTACAAAAATCAGAAATCTTAGCTGTTTTTGTACTTGATTGTTGGACTCACAGGTGGCGT
My config file has the following PAIR1_EXT = _R1_paired PAIR2_EXT = _R2_pairedAnd I have one folder per sample.