nesilin closed this 7 years ago
Single end reads are reads where the flag indicating multiple segments is not set:
1 0x1 template having multiple segments in sequencing
See the spec here: section '1.4 The alignment section: mandatory fields' -> '2. FLAG'
Orphans are reads that have the above flag set, but for which only one mate of the pair was found in the input file.
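To make the two definitions concrete, here is a minimal Python sketch that sorts reads into paired, orphan, and single end bins using only the SAM FLAG bits (0x1 for "multiple segments", 0x40 and 0x80 for first and last segment). It is an illustration of the definitions above, not bamtofastq's actual implementation: the function name `classify`, the bin names, and the `(name, flag)` input format are all made up for this example, and it ignores details such as secondary or supplementary alignments. The definitions above do not mention the unmapped bit, so the sketch does not look at it either.

```python
PAIRED = 0x1   # template having multiple segments in sequencing
FIRST = 0x40   # the first segment in the template
LAST = 0x80    # the last segment in the template


def classify(records):
    """Sort (read_name, flag) tuples into illustrative output bins.

    Hypothetical helper: bin names are chosen for this example only.
    Paired reads carrying neither 0x40 nor 0x80 are skipped.
    """
    bins = {"paired_1": [], "paired_2": [],
            "orphan_1": [], "orphan_2": [], "single": []}
    seen = {}  # read name -> set of mate numbers found in the input

    for name, flag in records:
        if not flag & PAIRED:
            # Flag 0x1 not set: a single end read.
            bins["single"].append(name)
        elif flag & FIRST:
            seen.setdefault(name, set()).add(1)
        elif flag & LAST:
            seen.setdefault(name, set()).add(2)

    for name, mates in seen.items():
        if mates == {1, 2}:
            # Both mates present: a complete pair.
            bins["paired_1"].append(name)
            bins["paired_2"].append(name)
        elif mates == {1}:
            # Flag 0x1 set, but the second mate is missing: orphan.
            bins["orphan_1"].append(name)
        else:
            bins["orphan_2"].append(name)
    return bins


example = [
    ("a", PAIRED | FIRST), ("a", PAIRED | LAST),  # complete pair
    ("b", PAIRED | FIRST),                        # second mate missing
    ("c", PAIRED | LAST),                         # first mate missing
    ("d", 0),                                     # single end read
]
print(classify(example))
```

Under these assumptions the three categories never overlap: a read lands in the single end bin only when 0x1 is not set, so the single end output is not the sum of the two orphan outputs.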
Thanks!
Hi!
I have a very naive question. When using bamtofastq, the BAM file is split into read group 1 and read group 2 of paired reads, and then there are also the incomplete reads coming from group 1 and group 2.
What exactly does "incomplete" refer to in the documentation (https://github.com/gt1/biobambam2/blob/master/src/programs/bamtofastq.1)? Can you please tell me whether incomplete means unpaired (= unmatched) reads? Does this have anything to do with unmapped reads?
-outputperreadgroupsuffixO=<_o1.fq> output file name suffix for first mates of incomplete pairs if outputperreadgroup=1. Default is _o1.fq if gz=0 and _o1.fq.gz for gz=1.
-outputperreadgroupsuffixO2=<_o2.fq> output file name suffix for second mates of incomplete pairs if outputperreadgroup=1. Default is _o2.fq if gz=0 and _o2.fq.gz for gz=1.
-outputperreadgroupsuffixS=<_s.fq> output file name suffix for single end reads if outputperreadgroup=1. Default is _s.fq if gz=0 and _s.fq.gz for gz=1.
Besides, what is the difference between single end reads and unmatched (orphan) reads when defining the output files of bamtofastq? Is the _s.fastq.gz file the sum of the _o1.fastq.gz and _o2.fastq.gz files?
-S=:
output file for single end reads if collation is active
-O=:
output file for unmatched (orphan) first mates if collation is active.
-O2=:
output file for unmatched (orphan) second mates if collation is active.
Thanks!