A great tool for Hi-C data analysis.
The alignment rates of my raw R1 and R2 reads are both around 65%, which I know is not ideal. However, in the read pairing step (the script mergeSAM.py), the BAM obtained by merging R1_hg38.bwt2merged.bam and R2_hg38.bwt2merged.bam contains far too many Pairs_with_singleton: about 55% according to the statistics file. I checked the matching and naming of my files and found no problem. The Hi-C library was prepared with the Arima kit. Is the restriction enzyme information used in the read pairing step? I think not. Do you know what the reason could be? Thank you! Below are my configuration file and the statistics for bwt2pairs.bam.
Please change the variable settings below if necessary
#########################################################################
Paths and Settings - Do not edit !
#########################################################################
TMP_DIR = tmp
LOGS_DIR = logs
BOWTIE2_OUTPUT_DIR = bowtie_results
MAPC_OUTPUT = hic_results
RAW_DIR = rawdata
#######################################################################
SYSTEM AND SCHEDULER - Start Editing Here !!
#######################################################################
N_CPU = 50
SORT_RAM = 30000M
LOGFILE = hicpro.log
JOB_NAME =
JOB_MEM =
JOB_WALLTIME =
JOB_QUEUE =
JOB_MAIL =
#########################################################################
Data
#########################################################################
PAIR1_EXT = _R1
PAIR2_EXT = _R2
#######################################################################
Alignment options
#######################################################################
MIN_MAPQ = 10
BOWTIE2_IDX_PATH = /mnt/d/Reference/hg38/bowtie2_index/
BOWTIE2_GLOBAL_OPTIONS = --very-sensitive -L 30 --score-min L,-0.6,-0.2 --end-to-end --reorder
BOWTIE2_LOCAL_OPTIONS = --very-sensitive -L 20 --score-min L,-0.6,-0.2 --end-to-end --reorder
#######################################################################
Annotation files
#######################################################################
REFERENCE_GENOME = hg38
GENOME_SIZE = /mnt/d/Softwares/HiC-Pro_3.1.0/annotation/chrom_hg38.sizes
#######################################################################
Allele specific analysis
#######################################################################
ALLELE_SPECIFIC_SNP =
#######################################################################
Capture Hi-C analysis
#######################################################################
CAPTURE_TARGET =
REPORT_CAPTURE_REPORTER = 1
#######################################################################
Digestion Hi-C
#######################################################################
GENOME_FRAGMENT = /mnt/d/Softwares/HiC-Pro_3.1.0/annotation/ArimaKit_redfrag_hg38.bed
LIGATION_SITE = GATCGATC,GANTGATC,GANTANTC,GATCANTC
MIN_FRAG_SIZE =
MAX_FRAG_SIZE =
MIN_INSERT_SIZE =
MAX_INSERT_SIZE =
#######################################################################
Hi-C processing
#######################################################################
MIN_CIS_DIST =
GET_ALL_INTERACTION_CLASSES = 1
GET_PROCESS_SAM = 0
RM_SINGLETON = 1
RM_MULTI = 1
RM_DUP = 1
#######################################################################
Contact Maps
#######################################################################
BIN_SIZE = 5000 10000 20000 25000 40000 100000 500000
MATRIX_FORMAT = upper
#######################################################################
Normalization
#######################################################################
MAX_ITER = 100
FILTER_LOW_COUNT_PERC = 0.02
FILTER_HIGH_COUNT_PERC = 0
EPS = 0.1
########### read information
Total_pairs_processed 661159587 100.0
Unmapped_pairs 56087180 8.483
Low_qual_pairs 235828621 35.669
Unique_paired_alignments 312027 0.047
Multiple_pairs_alignments 0 0.0
Pairs_with_singleton 368931759 55.801
Low_qual_singleton 0 0.0
Unique_singleton_alignments 0 0.0
Multiple_singleton_alignments 0 0.0
Reported_pairs 312027 0.047
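To double-check the numbers above, here is a minimal Python sketch that re-parses the statistics and recomputes the percentages. The function names and the `TOP_LEVEL` grouping are my own assumptions, not part of HiC-Pro. I am assuming that the five listed categories partition the total, which is at least consistent with the counts in this file.

```python
# Statistics copied from the pairing stat file shown above.
STATS_TEXT = """\
Total_pairs_processed 661159587 100.0
Unmapped_pairs 56087180 8.483
Low_qual_pairs 235828621 35.669
Unique_paired_alignments 312027 0.047
Multiple_pairs_alignments 0 0.0
Pairs_with_singleton 368931759 55.801
Low_qual_singleton 0 0.0
Unique_singleton_alignments 0 0.0
Multiple_singleton_alignments 0 0.0
Reported_pairs 312027 0.047
"""

# Assumption: these categories are mutually exclusive and sum to the total.
TOP_LEVEL = (
    "Unmapped_pairs",
    "Low_qual_pairs",
    "Unique_paired_alignments",
    "Multiple_pairs_alignments",
    "Pairs_with_singleton",
)


def parse_pairstat(text):
    """Return {category: count} from whitespace-separated stat lines."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip comment/header lines and anything without a count column.
        if line.startswith("#") or len(fields) < 2:
            continue
        counts[fields[0]] = int(fields[1])
    return counts


def percentages(counts, total_key="Total_pairs_processed"):
    """Recompute each category as a percentage of the total pair count."""
    total = counts[total_key]
    return {name: round(100.0 * n / total, 3) for name, n in counts.items()}


if __name__ == "__main__":
    counts = parse_pairstat(STATS_TEXT)
    top_level_sum = sum(counts[name] for name in TOP_LEVEL)
    print("top-level sum equals total:",
          top_level_sum == counts["Total_pairs_processed"])
    for name, pct in percentages(counts).items():
        print(f"{name}\t{counts[name]}\t{pct}")
```

With these counts the top-level categories do add up to Total_pairs_processed, so the statistics file itself appears internally consistent. The singleton fraction is therefore not a reporting artifact.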