walaj / bxtools

Tools for analyzing 10X Genomics data
MIT License

SAM to FASTQ functionality #12

Open JustinChu opened 7 years ago

JustinChu commented 7 years ago

Say you are given only a longranger-processed BAM file that you would like to use for other purposes. Many tools cannot read BAM directly. It would be useful to support a fast SAM/BAM-to-FASTQ conversion that preserves the BX, RX, and even MI tags in the FASTQ read header.

This is pretty easy to script, but I think it could be useful if written efficiently. It would also be an opportunity to encourage some sort of standardized FASTQ header format for Chromium data.

walaj commented 7 years ago

Is there a preferred format for this header? This would be easy to incorporate, just give me the format.

JustinChu commented 7 years ago

Ideally, it would be similar to whatever longranger basic provides, but I haven't figured out where the RX tags are stored in the FASTQ file generated.

Maybe something like this:

@ReadID BX:Z:AAACACCAGAAACCCG-1 RX:Z:AAACACCAGAAACCCG MI:i:22359430

Make sure it looks at the flag and reverse-complements the read back if the alignment has reversed it.
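For illustration, here is a minimal Python sketch of that conversion for a single SAM text record. The function name `sam_line_to_fastq` and the `KEEP_TAGS` tuple are hypothetical, not part of bxtools, and the space-separated header simply follows the format proposed above. It assumes an uncompressed, tab-delimited SAM line and does not filter secondary or supplementary alignments.

```python
# Hypothetical helper, not part of bxtools: convert one SAM text record
# into a FASTQ entry whose header carries the BX, RX and MI tags.

# Translation table used to complement bases (IUPAC codes other than N are left as-is).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

# Tags to copy into the FASTQ header, in output order (an assumption).
KEEP_TAGS = ("BX", "RX", "MI")


def sam_line_to_fastq(line, keep_tags=KEEP_TAGS):
    """Return a 4-line FASTQ record for one tab-delimited SAM alignment line."""
    fields = line.rstrip("\n").split("\t")
    name, flag = fields[0], int(fields[1])
    seq, qual = fields[9], fields[10]

    # Flag bit 0x10 means the stored SEQ is the reverse complement of the
    # original read, so undo it: reverse-complement SEQ and reverse QUAL.
    if flag & 0x10:
        seq = seq.translate(_COMPLEMENT)[::-1]
        qual = qual[::-1]

    # Optional fields start at column 12 and look like TAG:TYPE:VALUE.
    tags = {field[:2]: field for field in fields[11:]}
    kept = [tags[tag] for tag in keep_tags if tag in tags]

    header = "@" + " ".join([name] + kept)
    return f"{header}\n{seq}\n+\n{qual}\n"
```

A real implementation would also need to skip secondary and supplementary records (flag bits 0x100 and 0x800) so that each read is emitted only once, and to pair up mates if interleaved output is wanted.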

sjackman commented 6 years ago

You can use samtools fastq -TBX for this purpose.