amplab / snap

Scalable Nucleotide Alignment Program -- a fast and accurate read aligner for high-throughput sequencing data
https://www.microsoft.com/en-us/research/project/snap/
Apache License 2.0

Add option to add FASTQ comment to SAM output. #168

Closed: ghuls closed this issue 8 months ago

ghuls commented 8 months ago

Add option to add FASTQ comment to SAM output.

For example, BWA supports this with its -C option:

-C  Append FASTA/Q comment to SAM output.
        This option can be used to transfer read meta information (e.g. barcode) to the SAM output.
        Note that the FASTA/Q comment (the string after a space in the header line) must conform to the
        SAM spec (e.g. BC:Z:CGTAC). Malformed comments lead to incorrect SAM output.

This is useful when cell barcode information (CB tag) has been added to the comment part of the FASTQ header line (read name + space + SAM tags separated by TABs):

@A00305:504:HLMGMDRXY:1:2101:23023:1031 CR:Z:CAACGGCCACACATGT   CY:Z:FFFFFFFFFFFFFFFF   CB:Z:CAACGGCCACACATGT-1
GTCTTGGCTTTCTGTGCGGAAGTGGGGCTGGCTGGCATAGAATTCCTTTG
+
FFFFFFFF,:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFF
@A00305:504:HLMGMDRXY:1:2101:24415:1031 CR:Z:TGTAAGCAGACACAAA   CY:Z:FFFFFFFFFFFFFF,,   CB:Z:TGTAAGCAGACACAAT-1
ATGTCACACTCTGTGTCTTCTGCTAGCCCAGTCCTGTTTGGCAGCTCTAG
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
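
To make the request concrete, here is a minimal Python sketch of the behavior I am asking for. It is not how BWA or SNAP implement it. The function names `split_fastq_header` and `append_comment_tags` are hypothetical, and the tag check is only a loose approximation of the SAM optional-field syntax. It assumes the comment holds TAB-separated tags after the first space in the header line.

```python
import re

# Loose check for a SAM optional field (TAG:TYPE:VALUE).
# This is an approximation for illustration, not a full SAM validator.
SAM_TAG = re.compile(r"[A-Za-z][A-Za-z0-9]:[AifZHB]:[ -~]+")


def split_fastq_header(header):
    """Return (read_name, tags) from a FASTQ header line.

    The read name is everything up to the first space; the remainder
    (the comment) is assumed to hold TAB-separated SAM tags.
    """
    text = header.rstrip("\r\n")
    if text.startswith("@"):
        text = text[1:]
    name, _, comment = text.partition(" ")
    tags = [t for t in comment.split("\t") if t]
    return name, tags


def append_comment_tags(sam_line, tags):
    """Append the comment tags as extra TAB-separated fields on a SAM record."""
    bad = [t for t in tags if not SAM_TAG.fullmatch(t)]
    if bad:
        # Malformed comments would produce invalid SAM, so refuse them here.
        raise ValueError("comment is not valid SAM tags: %r" % bad)
    return "\t".join([sam_line.rstrip("\r\n")] + tags)


# Example with the first read above (TABs written explicitly as \t).
header = ("@A00305:504:HLMGMDRXY:1:2101:23023:1031 "
          "CR:Z:CAACGGCCACACATGT\tCY:Z:FFFFFFFFFFFFFFFF\tCB:Z:CAACGGCCACACATGT-1")
name, tags = split_fastq_header(header)

# A made-up unaligned SAM record (11 mandatory fields) for that read.
sam_record = "\t".join([
    name, "4", "*", "0", "0", "*", "*", "0", "0",
    "GTCTTGGCTTTCTGTGCGGAAGTGGGGCTGGCTGGCATAGAATTCCTTTG",
    "FFFFFFFF,:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFF",
])
sam_with_tags = append_comment_tags(sam_record, tags)
```

With this, the CR, CY and CB tags end up as optional fields at the end of the SAM record, which is what downstream single-cell tools expect.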

bolosky commented 8 months ago

Check out the -pfc (preserve FASTQ comments) flag. Let me know if that doesn't do what you want.

ghuls commented 8 months ago

Thanks, that is the option I was looking for!