epam / fonda

Fonda is a framework which offers scalable and automatic analysis of multiple NGS sequencing data types
Apache License 2.0

Combine PicardMarkDuplicate and PicardRemoveDuplicate into single tool #197

Open syansanofi opened 3 years ago

syansanofi commented 3 years ago

Is it possible to merge these two tools into a single one? For example:

picard MarkDuplicates REMOVE_DUPLICATES=true

can be used.

Is there any other situation where a separate samtools call is needed to mark duplicates?

Additionally, CREATE_INDEX=true can be used to replace the separate indexing step that follows duplicate handling:

picard MarkDuplicates REMOVE_DUPLICATES=true CREATE_INDEX=true

kamyshova commented 3 years ago

@syansanofi Yes, it's possible. We currently apply PicardRemoveDuplicate when the rmdup flag is on, or when the workflow is scRnaExpression_Fastq, capture, or WGS. Should we preserve this logic?
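
For discussion, here is a minimal sketch of how the condition described above could drive a single MarkDuplicates call. It is not Fonda's actual implementation: the function names, the exact workflow strings matched ("capture", "WGS"), and the output file naming are assumptions made for illustration. INPUT, OUTPUT, METRICS_FILE, REMOVE_DUPLICATES and CREATE_INDEX are standard Picard MarkDuplicates arguments in the legacy KEY=value syntax.

```shell
#!/usr/bin/env bash

# Hypothetical helper: decide whether duplicates should be removed.
# $1 = rmdup flag ("true" or "false"), $2 = workflow name.
# The workflow names matched here are assumed spellings, not confirmed identifiers.
should_remove_duplicates() {
    local rmdup="$1" workflow="$2"
    if [ "$rmdup" = "true" ]; then
        echo true
        return
    fi
    case "$workflow" in
        scRnaExpression_Fastq|capture|WGS) echo true ;;
        *) echo false ;;
    esac
}

# Hypothetical helper: build one MarkDuplicates command line that marks
# duplicates, optionally removes them, and writes the BAM index in the same run.
# $3 = input BAM; output and metrics names are placeholders derived from it.
build_markdup_cmd() {
    local rmdup="$1" workflow="$2" bam="$3"
    local remove
    remove="$(should_remove_duplicates "$rmdup" "$workflow")"
    printf '%s\n' "picard MarkDuplicates INPUT=${bam} OUTPUT=${bam%.bam}.dedup.bam METRICS_FILE=${bam%.bam}.dup_metrics.txt REMOVE_DUPLICATES=${remove} CREATE_INDEX=true"
}

# Example: rmdup flag off, but a WGS workflow still triggers removal.
build_markdup_cmd false WGS sample.bam
```

With this shape, the existing logic is preserved as the value passed to REMOVE_DUPLICATES, so a single tool invocation would cover both the mark-only and the mark-and-remove cases.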