broadinstitute / picard

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
https://broadinstitute.github.io/picard/
MIT License
984 stars 368 forks

Should CollectHsMetrics also emit PCT@5x? #852

Open yfarjoun opened 7 years ago

yfarjoun commented 7 years ago

It currently emits the percentage of bases at 1x, 2x, 10x, 20x, 30x, 40x, 50x, and 100x.

There seems to be at least one user who's interested (#851), but I'm not sure that's good enough evidence for changing the summary metric.

Alternatively, we could include PCT@ANYx in the histogram, which would mean fewer changes to the API.

Opinions?

@ktibbett @jacarey @tfenne @nh13

tfenne commented 7 years ago

I agree with your thought here and elsewhere that it would be much nicer to emit the histogram. We could even emit a pair of histograms that are either counts (or proportions) of bases @ nX and >= nX to make it so that users don't have to do too much math.
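To illustrate the "@ nX" versus ">= nX" idea, here is a minimal sketch of the math a user would otherwise do by hand. It assumes a per-depth histogram where `counts[d]` is the number of bases covered at exactly depth `d`. The function names and the toy data are hypothetical and are not part of Picard's output or API.

```python
from itertools import accumulate


def at_least_histogram(counts):
    """Turn an exact-depth histogram into a '>= depth' histogram.

    counts[d] is assumed to be the number of bases at exactly depth d.
    """
    # Accumulate from the deepest bin backwards, then restore the order.
    return list(accumulate(reversed(counts)))[::-1]


def pct_at_least(counts, threshold):
    """Fraction of bases with depth >= threshold (e.g. threshold=5 for PCT@5x)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    # Slicing past the end of the list yields an empty list, so the sum is 0.
    return sum(counts[threshold:]) / total


# Hypothetical toy histogram for depths 0..6 (100 bases in total).
counts = [10, 5, 5, 10, 20, 30, 20]
print(at_least_histogram(counts))
print(pct_at_least(counts, 5))
```

With a ">= nX" histogram emitted directly, any threshold such as 5x becomes a single lookup rather than a sum over bins.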

nh13 commented 7 years ago

I think the ship has sailed on making the metrics extensible (i.e. letting the user specify the coverage thresholds), so the histogram would be nicer, since then we don't need to change the summary metrics.

jacarey commented 7 years ago

Agreed. From a production standpoint, I'd prefer to leave the summary metrics alone.

yfarjoun commented 7 years ago

Sounds good! I'll tag this as an easy task for someone to do when they want to learn coding for Picard.