Open seifudd opened 8 months ago
Hi,
I am trying to use the demultiplexed BAM output from zUMIs with Picard Tools, but it does not seem to be working.
Below are a few lines from a demultiplexed BAM file (one sample) output from zUMIs:
A00267:423:HFMHMDRX3:1:2101:1000:34663 99 19 48452905 255 88M = 48453094 277 GCTGTTCGTGCACCAGGGCGAGACCGAGCTGAAGGAGCTGCACTGGCACCCGCAGTGCCCAGGGCTCCTGGTCAGCACGGCGCTGTCA FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF NH:i:1 HI:i:1 AS:i:174 nM:i:0 BX:Z:AGTGACCTCTCCTAGA BC:Z:AGTGACCTCTCCTAGA UB:Z:CCCGCANCAGCTCTGGATCAGAGC XS:Z:Assigned2 XN:i:1 XT:Z:ENSG00000105447
A00267:423:HFMHMDRX3:1:2101:1000:34663 147 19 48453094 255 88M = 48452905 -277 GGTTCATTCAGGTCTGTTGACTGAGACTGGCCGGCCTGTGGGCTGCCGTGATGGATTCTGTTTGACGTATTGTTCTCTAGAAGGCCTG FFFFFFFF:FFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF NH:i:1 HI:i:1 AS:i:174 nM:i:0 BX:Z:AGTGACCTCTCCTAGA BC:Z:AGTGACCTCTCCTAGA UB:Z:CCCGCANCAGCTCTGGATCAGAGC XS:Z:Assigned2 XN:i:1 XT:Z:ENSG00000105447
A00267:423:HFMHMDRX3:1:2101:1000:35978 83 2 105342985 255 88M = 105339692 -3381 CCAGTAATGCCTTTAGAAAATTATCAAATTCCTCTTCGAGTGTTTCACCCCTAATTTTGTCTTCCAATTTGCCTGTGAACAATAAAAC FFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF NH:i:1 HI:i:1 AS:i:175 nM:i:0 BX:Z:AGTGACCTCTCCTAGA BC:Z:AGTGACCTCTCCTAGA UB:Z:TTATTGTGTTCCCGAAGAATAGAT XS:Z:Assigned3 XN:i:1 XT:Z:ENSG00000135974
A00267:423:HFMHMDRX3:1:2101:1000:35978 163 2 105339692 255 58M3098N30M = 105342985 3381 GGGGGAAAATGATGGAAAAGAAAAGAGAACAACATGAGATTAAAAATGAGACTAAAAGGAGTAGCACTGTAGATGGGTTAAGGAAAAG FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,F:FFF NH:i:1 HI:i:1 AS:i:175 nM:i:0 BX:Z:AGTGACCTCTCCTAGA BC:Z:AGTGACCTCTCCTAGA UB:Z:TTATTGTGTTCCCGAAGAATAGAT XS:Z:Assigned3 XN:i:1 XT:Z:ENSG00000135974
A00267:423:HFMHMDRX3:1:2101:1000:36166 83 16 69718137 255 88M = 69713114 -5111 GGTCTGCGGCTTCCAGCTTCTTTTGTTCAGCCACAATATCTGGGCTCAGATGGCCTTCTTTATAAGCCAGAACAGACTCGGCAGGATA :FFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF NH:i:1 HI:i:1 AS:i:175 nM:i:0 BX:Z:AGTGACCTCTCCTAGA BC:Z:AGTGACCTCTCCTAGA UB:Z:GCGAACTTTCAGTGGTGATGGAAA XS:Z:Assigned3 XN:i:1 XT:Z:ENSG00000181019
A00267:423:HFMHMDRX3:1:2101:1000:36166 163 16 69713114 255 16M1834N72M = 69718137 5111 GCACTGCCTTCTTACTCCGGAAGGGTCCTTTGTCATACATGGCAGCGTAAGTGTAAGCAAACTCTCCTATGAACACTCGCTCAAACCA FFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFF:F:FFFFFFFFFFFFFFFFFFFFFFFFFFFF,, NH:i:1 HI:i:1 AS:i:175 nM:i:0 BX:Z:AGTGACCTCTCCTAGA BC:Z:AGTGACCTCTCCTAGA UB:Z:GCGAACTTTCAGTGGTGATGGAAA XS:Z:Assigned3 XN:i:1 XT:Z:ENSG00000181019
A00267:423:HFMHMDRX3:1:2101:1009:15107 99 10 26501106 255 6M2085N78M8472N4M = 26511819 10801 TCTCAGGAAGAGGAAGAAGCCCAAGCCAAGGCTGATAAAATTAAGCTGGCGCTGGAAAAACTGAAGGAGGCCAAGGTTAAGAAGCTCG FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFF NH:i:1 HI:i:1 AS:i:177 nM:i:0 BX:Z:AGTGACCTCTCCTAGA BC:Z:AGTGACCTCTCCTAGA UB:Z:CCTGAACCTCTCCAAAAAACCTCT XS:Z:Assigned3 XN:i:1 XT:Z:ENSG00000077420
A00267:423:HFMHMDRX3:1:2101:1009:15107 147 10 26511819 255 88M = 26501106 -10801 GATGTTCTGGACAACCTTTTCGAGAAAACTCATTGTGACTGCAATGTAGACTGGTGTCTTTATGAAATCTACCCGGAACTACAAATTG :FFFFF:FFFF:FFFF,FFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF NH:i:1 HI:i:1 AS:i:177 nM:i:0 BX:Z:AGTGACCTCTCCTAGA BC:Z:AGTGACCTCTCCTAGA UB:Z:CCTGAACCTCTCCAAAAAACCTCT XS:Z:Assigned3 XN:i:1 XT:Z:ENSG00000077420
A00267:423:HFMHMDRX3:1:2101:1009:15515 83 11 66003740 255 88M = 66003676 -152 TGCCTTCGAGAGTGGTGCGACGCCTTCTTGTGATGCTCTCTGGGAAGCTCTCAATCCCCAGCCCTCATCCAGAGTTTGCAGCCGAGTA FFFFFFFFFFF:FFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF NH:i:1 HI:i:1 AS:i:173 nM:i:0 BX:Z:AGTGACCTCTCCTAGA BC:Z:AGTGACCTCTCCTAGA UB:Z:GGGAGGAGTCCCAGATGAAGACCT XS:Z:Assigned3 XN:i:1 XT:Z:ENSG00000175334
A00267:423:HFMHMDRX3:1:2101:1009:15515 163 11 66003676 255 87M1S = 66003740 152 CTTCCGGGAATGGCTGAAAGACACTTGTGGCGCCAACGCCAAGCAGTCCCGGGACTGCTTCGGATGCCTTCGAGAGTGGTGCGACGCG FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFF:FFF:FFFFFFFFFF, NH:i:1 HI:i:1 AS:i:173 nM:i:0 BX:Z:AGTGACCTCTCCTAGA BC:Z:AGTGACCTCTCCTAGA UB:Z:GGGAGGAGTCCCAGATGAAGACCT XS:Z:Assigned3 XN:i:1 XT:Z:ENSG00000175334
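For reference, the FLAG column (second field) of the records above can be decoded with a few lines of Python. This is only a minimal sketch for reading the excerpt: the bit meanings follow the SAM specification, while the names `SAM_FLAGS` and `decode_flag` are made up for illustration and are not part of zUMIs or Picard.

```python
# Bit meanings of the SAM FLAG field, as defined in the SAM specification.
# The dictionary and function names are illustrative only.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag: int) -> list[str]:
    """Return the names of all bits that are set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


# The four FLAG values that occur in the excerpt above.
for flag in (99, 147, 83, 163):
    print(flag, decode_flag(flag))
```

None of the four values has the unmapped (0x4) or QC-fail (0x200) bit set, so by their flags the reads in this excerpt are mapped, properly paired, and passing filter.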
Below is the output of a Picard Tools CollectAlignmentSummaryMetrics run, which assumed that the BAM file is coordinate sorted (ASSUME_SORTED=true):
## htsjdk.samtools.metrics.StringHeader
# CollectAlignmentSummaryMetrics EXPECTED_PAIR_ORIENTATIONS=[] INPUT=Tunic.AGTGACCTCTCCTAGA.demx.bam OUTPUT=Tunic.AGTGACCTCTCCTAGA.demx.summary.metrics.txt MAX_INSERT_SIZE=100000 ADAPTER_SEQUENCE=[AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT, AGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG, AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT, AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTG, AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT, AGATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG] METRIC_ACCUMULATION_LEVEL=[ALL_READS] IS_BISULFITE_SEQUENCED=false ASSUME_SORTED=true STOP_AFTER=0 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
## htsjdk.samtools.metrics.StringHeader
# Started on: Thu Nov 09 00:35:11 EST 2023
## METRICS CLASS picard.analysis.AlignmentSummaryMetrics
CATEGORY TOTAL_READS PF_READS PCT_PF_READS PF_NOISE_READS PF_READS_ALIGNED PCT_PF_READS_ALIGNED PF_ALIGNED_BASES PF_HQ_ALIGNED_READS PF_HQ_ALIGNED_BASES PF_HQ_ALIGNED_Q20_BASES PF_HQ_MEDIAN_MISMATCHES PF_MISMATCH_RATE PF_HQ_ERROR_RATE PF_INDEL_RATE MEAN_READ_LENGTH READS_ALIGNED_IN_PAIRS PCT_READS_ALIGNED_IN_PAIRS PF_READS_IMPROPER_PAIRS PCT_PF_READS_IMPROPER_PAIRS BAD_CYCLES STRAND_BALANCE PCT_CHIMERAS PCT_ADAPTER SAMPLE LIBRARY READ_GROUP
FIRST_OF_PAIR 71490794 71490794 1 62694189 0 0 0 0 0 0 0 0 0 0 88 0 0 0 0 0 0 0 0.003427
SECOND_OF_PAIR 71475327 71475327 1 62037196 0 0 0 0 0 0 0 0 0 0 88 0 0 0 0 0 0 0 0.000061
PAIR 142966121 142966121 1 124731385 0 0 0 0 0 0 0 0 0 0 88 0 0 0 0 0 0 0 0.001744
PF_READS_ALIGNED is 0 in every category, and most reads (roughly 87%) are counted as PF_NOISE_READS.
The same behavior occurs when I use the <>.filtered.tagged.Aligned.out.bam file instead.
Am I missing something? I thought that the BAM files output from zUMIs were compatible with Picard tools etc.
Attached is the yaml file:
Tunic.zUMIs_config_formated.yaml.txt
Attached is the command line log file output from zUMIs:
Tunic.command_line_output_zummis.txt
Thank you for your help. Appreciate it.
Thanks, Fayaz
Not sure here because the BAM file output you show looks properly formatted. Maybe Picard tools expects a sorted file? Try running samtools sort prior to your Picard command.
Best, Christoph
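A possible way to act on this suggestion is sketched below in Python. It is an assumption-laden sketch, not a confirmed fix: it reads the sort order that the BAM header declares (the `SO` tag of the `@HD` line, per the SAM specification) and, if it is not `coordinate`, runs `samtools sort` and `samtools index`. The helper names and the output file name `Tunic.AGTGACCTCTCCTAGA.demx.sorted.bam` are hypothetical; only the input file name comes from the Picard log above.

```python
import os
import shutil
import subprocess


def sort_order_from_header(header_text: str) -> str:
    """Return the SO value of the @HD header line, or 'unknown' if absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return "unknown"


def sort_commands(in_bam: str, sorted_bam: str) -> list[list[str]]:
    """Build the samtools calls that coordinate-sort and index a BAM file."""
    return [
        ["samtools", "sort", "-o", sorted_bam, in_bam],
        ["samtools", "index", sorted_bam],
    ]


if __name__ == "__main__":
    in_bam = "Tunic.AGTGACCTCTCCTAGA.demx.bam"             # from the Picard log
    sorted_bam = "Tunic.AGTGACCTCTCCTAGA.demx.sorted.bam"  # hypothetical name
    # Only run when samtools is installed and the input file is present.
    if shutil.which("samtools") and os.path.exists(in_bam):
        header = subprocess.run(
            ["samtools", "view", "-H", in_bam],
            capture_output=True, text=True, check=True,
        ).stdout
        if sort_order_from_header(header) != "coordinate":
            for cmd in sort_commands(in_bam, sorted_bam):
                subprocess.run(cmd, check=True)
```

The sorted file would then be passed to Picard as INPUT in place of the original demultiplexed BAM.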