tgen / tempe

A container-focused, streamlined iteration of the Phoenix pipeline
GNU General Public License v3.0

UMI Stats Needed #37

Closed: PedalheadPHX closed this issue 1 year ago

PedalheadPHX commented 1 year ago

@bryce-turner Can we collect the count of reads in the uBAM after barcode correction, and the number of reads in the uBAM after simplex or duplex read collapsing?

@denriquez Can we capture these two counts in the LIMS, along with the read family counts?

bryce-turner commented 1 year ago

I'll take a look at this and see about creating a JSON from the output.

Would we want only the count of reads? For example, we could simply run samtools view -c ubam_after_correction.u.bam. Or would we want additional metrics? Since these are u.bam files, we may not need or want much else at this level. We may also want to run CollectDuplexSeqMetrics if duplex was enabled, but that could happen downstream as part of the bam_qc tasks.
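As a rough sketch of the count-to-JSON idea, something like the following Python could work. It assumes samtools is on the PATH. The JSON keys and the function names are hypothetical placeholders and do not come from the pipeline itself.

```python
import json
import subprocess


def build_count_command(bam_path):
    # "samtools view -c" prints the number of records instead of the records.
    return ["samtools", "view", "-c", bam_path]


def parse_count(stdout):
    # samtools prints a single integer followed by a newline.
    return int(stdout.strip())


def count_reads(bam_path):
    # Run samtools and fail loudly (check=True) if it exits non-zero.
    result = subprocess.run(
        build_count_command(bam_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_count(result.stdout)


def umi_stats_json(stage_to_bam, counter=count_reads):
    # stage_to_bam maps a stage label (hypothetical key names) to a uBAM path.
    # counter is injectable so the function can be exercised without samtools.
    stats = {stage: counter(path) for stage, path in stage_to_bam.items()}
    return json.dumps(stats, indent=2)


# Example call (the key name is illustrative only):
# print(umi_stats_json({"post_umi_correction_reads": "ubam_after_correction.u.bam"}))
```

Keeping the counter injectable means the JSON assembly can be tested on its own, and a different counting method could be swapped in later without changing the output format.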

PedalheadPHX commented 1 year ago

Yep, just a simple samtools view -c input.bam.

Just want to see FASTQ_Reads=100 POST_UMI_CORRECTION_Reads=80 POST_READ_COLLAPSING_Reads=64

Percent mapped currently gives some insight, but it is not clear at which step we are losing reads.
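To show how the three counts would pinpoint where reads drop out, here is a small Python sketch that turns an ordered list of per-stage counts into per-step retention percentages. The function name and output layout are hypothetical; the numbers are the illustrative ones from this comment, not real data.

```python
def step_retention(stage_counts):
    """Return (stage, reads, percent retained from the previous stage).

    stage_counts is an ordered list of (stage_name, read_count) pairs.
    The first stage has no previous stage, so its percentage is None.
    """
    rows = []
    previous = None
    for stage, reads in stage_counts:
        if previous is None or previous == 0:
            pct = None  # nothing to compare against (or avoid dividing by zero)
        else:
            pct = round(100 * reads / previous, 2)
        rows.append((stage, reads, pct))
        previous = reads
    return rows


# Illustrative counts taken from the example above.
example = [
    ("FASTQ_Reads", 100),
    ("POST_UMI_CORRECTION_Reads", 80),
    ("POST_READ_COLLAPSING_Reads", 64),
]

for stage, reads, pct in step_retention(example):
    label = "n/a" if pct is None else f"{pct}%"
    print(f"{stage}={reads} retained_from_previous={label}")
```

With the example numbers, each step retains 80% of the reads from the step before it. A drop at the collapsing step is partly expected, since collapsing merges the reads of a family into a consensus read.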