Hi everybody,
we have got 10 bases UMI in the header of fastqfiles. It is codified by RX tag.
We are using GATK to find somatic variants with low VAF.
Unfortunately we found that after fastqtosam command we lost RX tag.
Which is the correct procedure to keep this information also in uBAM?
Hi everybody, we have got 10 bases UMI in the header of fastqfiles. It is codified by RX tag. We are using GATK to find somatic variants with low VAF. Unfortunately we found that after fastqtosam command we lost RX tag. Which is the correct procedure to keep this information also in uBAM?
We used this command contained in GATK WDL file.
picard FastqToSam \ FASTQ="MN18_700.R1.fastq.gz" \ FASTQ2="MN18_700.R2.fastq.gz" \ OUTPUT="ubam.bam" \ SAMPLE_NAME="MN18_700" \ READ_GROUP_NAME="S3771" \ LIBRARY_NAME="LowInput" \ PLATFORM_UNIT="NS500743" \ PLATFORM="illumina" \ SEQUENCING_CENTER="ICH" \ RUN_DATE="2019-06-05" \ SORT_ORDER="queryname"
but when we check for RX tags that are contained in fastq files header in uBAM file, we didn't find find them.
samtools view ubam.bam | less
NS500743:422:HMH25AFXY:1:11101:10000:7645 77 0 0 0 0 AGTTTACGGTTTTATCAATACAGTTTAATACACTTTTAAAGCCATACTTCTGGCATGATCTAGCATTTTCCAGGAAAATAAAAGAGGCACATACTGTAACTGTTGATCATGGGGAGTGATGGATGGGTTCTGCCCTCTGCTAGTGCAAAA AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEA<AEEEAAEEEEAAAEEE RG:Z:S3771 NS500743:422:HMH25AFXY:1:11101:10000:7645 141 0 0 0 0 CTTACCTGTTCCTCTAGACTTTCAGTGATATGTTTTGCACTAGCAGAGGGCAGAACCCATCCATCACTCCCCATGATCAACAGTTACAGTATGTGCCTCTTTTATTTTCCTGGAAAATGCTAGATCATGCCAGAAGTATGGCTTTAAAAG AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEE6/EEEEEEEEEEEEEE<EEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEAEEEEEEEEEEEEEEEEEE< RG:Z:S3771
zcat < MN18_700.R1.fastq.gz | grep -A 3 "NS500743:422:HMH25AFXY:1:11101:10000:7645"
@NS500743:422:HMH25AFXY:1:11101:10000:7645 RX:Z:GGTTACCGGA QX:Z:AAAAAEEEEE AGTTTACGGTTTTATCAATACAGTTTAATACACTTTTAAAGCCATACTTCTGGCATGATCTAGCATTTTCCAGGAAAATAAAAGAGGCACATACTGTAACTGTTGATCATGGGGAGTGATGGATGGGTTCTGCCCTCTGCTAGTGCAAAA + AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEA<AEEEAAEEEEAAAEEE
zcat < MN18_700.R2.fastq.gz | grep -A 3 "NS500743:422:HMH25AFXY:1:11101:10000:7645"
@NS500743:422:HMH25AFXY:1:11101:10000:7645 RX:Z:GGTTACCGGA QX:Z:AAAAAEEEEE CTTACCTGTTCCTCTAGACTTTCAGTGATATGTTTTGCACTAGCAGAGGGCAGAACCCATCCATCACTCCCCATGATCAACAGTTACAGTATGTGCCTCTTTTATTTTCCTGGAAAATGCTAGATCATGCCAGAAGTATGGCTTTAAAAG + AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEE6/EEEEEEEEEEEEEE<EEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEAEEEEEEEEEEEEEEEEEE<
Thanks for any kinds of help
Mat