PacificBiosciences / kineticsTools

Tools for detecting DNA modifications from single molecule, real-time sequencing data

Results are reported on the opposite strand from the read alignment; does the --referenceWindow argument remain the same for both strands? #99

Open Marjan-Hosseini opened 6 months ago

Marjan-Hosseini commented 6 months ago

ipdSummary reports results for reads mapped to the positive strand as being on the reverse strand, and vice versa. My reference is on the positive strand; I verified this with BLAST. I have a BAM file, which I split into forward and reverse reads with samtools as follows:

samtools view -h chr21.bam -F 0xF14 -o chr21+.bam;
samtools view -h chr21.bam -F 0xF04 -f 0x10 -o chr21-.bam
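For reference, the two flag masks can be decoded with plain shell arithmetic. This is only a sketch of how samtools' -F (exclude any of these bits) and -f (require all of these bits) behave, using the standard SAM FLAG bit values; the function names keeps, forward_filter and reverse_filter are made up for illustration and are not part of samtools.

```shell
# keeps FLAG EXCLUDE_MASK REQUIRE_MASK
# Prints "keep" if the read would pass "-F EXCLUDE_MASK -f REQUIRE_MASK", else "drop".
keeps() {
  if [ $(( $1 & $2 )) -eq 0 ] && [ $(( $1 & $3 )) -eq "$3" ]; then
    echo keep
  else
    echo drop
  fi
}

# 0xF14 = 0x800 supplementary | 0x400 duplicate | 0x200 QC fail
#       | 0x100 secondary     | 0x10 reverse strand | 0x4 unmapped
forward_filter() { keeps "$1" $((0xF14)) 0; }

# 0xF04 excludes the same classes except the reverse-strand bit,
# and -f 0x10 then requires the reverse-strand bit to be set.
reverse_filter() { keeps "$1" $((0xF04)) $((0x10)); }

# Try a few FLAG values: primary forward, primary reverse,
# secondary, supplementary, unmapped.
for flag in 0 16 256 2048 4; do
  printf 'flag=%-5s forward=%s reverse=%s\n' \
    "$flag" "$(forward_filter "$flag")" "$(reverse_filter "$flag")"
done
```

So, as far as the masks go, chr21+.bam should contain only primary forward-strand alignments and chr21-.bam only primary reverse-strand alignments.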

Then I run ipdSummary on both of these files:

REF=GCF_009914755.1_T2T-CHM13v2.0_genomic.fasta
min=0
max=45090682 # (I split the positions to multiple ranges in the actual script)
winid=20
ipdSummary chr21+.bam --reference $REF --referenceWindow $winid:$min-$max --identify m6A,m4C,m5C -j 30 --methylFraction --csv ch21+.csv --gff ch21+.gff
ipdSummary chr21-.bam --reference $REF --referenceWindow $winid:$min-$max --identify m6A,m4C,m5C -j 30 --methylFraction --csv ch21-.csv --gff ch21-.gff
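The range splitting mentioned in the comment above is roughly the following. This is a simplified sketch, not my exact script: make_windows is a hypothetical helper, the 10000000 step is an arbitrary example value, and I am assuming that adjacent windows may share a boundary coordinate (I have not confirmed whether ipdSummary treats the end position as inclusive).

```shell
# make_windows REF_ID MIN MAX STEP
# Prints one window per line in the form REF_ID:START-END, covering MIN..MAX.
make_windows() {
  start=$2
  while [ "$start" -lt "$3" ]; do
    end=$(( start + $4 ))
    # Clamp the last window to MAX.
    if [ "$end" -gt "$3" ]; then
      end=$3
    fi
    echo "$1:$start-$end"
    # Assumption: the next window starts where this one ended.
    start=$end
  done
}

# Same reference ID and range as above, in chunks of 10 Mb (example step only).
make_windows 20 0 45090682 10000000
```

Each printed line is then passed to --referenceWindow in a separate ipdSummary run, with the same windows used for both the forward and reverse BAM files.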

In the results for chr21+.bam, the strand is '-' in every record of the .gff file and '1' in the .csv file; the opposite holds for chr21-.bam. I checked whether this was caused by a typo on my side, but the SAM flags indicate that the reads in chr21-.bam are indeed on the reverse strand. I am not sure what I have done wrong. Am I using ipdSummary incorrectly? Is --referenceWindow used correctly here? I would appreciate any hint, especially regarding the usage of the --referenceWindow argument.
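In case it helps to reproduce the observation, the strand tally for a .gff output can be obtained along these lines. This is a minimal sketch that relies only on the strand being column 7 of a tab-separated GFF record; gff_strand_counts is a made-up helper name.

```shell
# gff_strand_counts: read a GFF file on stdin and print "<strand> <count>"
# for every strand value found in column 7, skipping '#' header lines.
gff_strand_counts() {
  awk -F '\t' '!/^#/ && NF >= 7 { n[$7]++ }
               END { for (s in n) print s, n[s] }' | LC_ALL=C sort
}

# Tally the output of the forward-read run, if the file is present.
if [ -f "ch21+.gff" ]; then
  gff_strand_counts < "ch21+.gff"
fi
```

Running the same function on ch21-.gff gives the tally for the reverse-read run, so the two can be compared directly.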