mskcc / nf-fastq-plus

Generate IGO fastqs, bams, stats and fingerprinting

WGS Samples do not run Mark Duplicates #208

Closed DavidStreid closed 3 years ago

DavidStreid commented 3 years ago

Description: Because the DRAGEN alignment step already marks duplicates, the Picard mark-duplicates command is not run for WGS samples. As a result, the .txt file that would be uploaded automatically to the LIMS is never written.

Until this is added, the mark duplicate stats can be extracted from the DRAGEN mapping metrics files (*.mapping_metrics.csv) written under /igo/staging/stats/

E.g. RUTH_0036_BHMJFKDSX2/RUTH_0036_BHMJFKDSX2___P09443_CM___116RO_T_IGO_09443_CM_3___GRCh38___HumanWholeGenome.mapping_metrics.csv

Extract the duplicate-marked read stats from these *.csv files with grep -

$ cat /igo/staging/stats/RUTH_0036_BHMJFKDSX2/RUTH_0036_BHMJFKDSX2___P09443_CM___116RO_T_IGO_09443_CM_3___GRCh38___HumanWholeGenome.mapping_metrics.csv \
  | grep "MAPPING/ALIGNING SUMMARY,,Number of duplicate marked read"
MAPPING/ALIGNING SUMMARY,,Number of duplicate marked reads,463757354,15.38
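
To pull out only the two values instead of the whole row, the grep above could be wrapped in a small awk helper. This is only a sketch: `dup_stats` is a hypothetical name, not part of nf-fastq-plus, and it assumes the row layout shown in the output above, with the fourth comma-separated field being the read count and the fifth the percentage.

```shell
# Hypothetical helper (not part of nf-fastq-plus): print the count and
# percentage of duplicate-marked reads from a DRAGEN *.mapping_metrics.csv.
# Assumed row layout, based on the output above:
#   section,<empty>,metric name,count,percent
dup_stats() {
  awk -F',' '
    # Match the summary row exactly, so other sections that reuse the
    # same metric name are not picked up.
    $1 == "MAPPING/ALIGNING SUMMARY" && $3 == "Number of duplicate marked reads" {
      printf "duplicate_reads=%s\tduplicate_pct=%s\n", $4, $5
    }
  ' "$1"
}
```

Called as `dup_stats <path to the mapping_metrics.csv>`, it would print `duplicate_reads=463757354` and `duplicate_pct=15.38`, separated by a tab, for the file above. If the CSV has Windows line endings, the last field would carry a trailing carriage return and would need stripping first.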
DavidStreid commented 3 years ago

Will be addressed by - https://github.com/mskcc/ngs-stats/issues/35