Hello there,
Documentation request
It would be awesome if each major/minor release of GATK shipped with benchmarking results run against a truth set. For calling short germline variants, you could evaluate precision and recall using one of the NIST Genome in a Bottle samples (e.g., HG002), and for calling short somatic variants you could use the SEQC2 tumor-normal pair.
I would like to see how new releases of GATK, and changes to your recommendations/best practices, affect precision and recall against a known truth set. This transparency would help anyone using your tools decide whether it is worth upgrading to a new release of GATK, or whether their implementation of your recommended best practices is performing as expected.
If this information could be added to the best practices pages, a relevant tutorial, or a new section of the documentation (containing hap.py/som.py benchmarking results and run times for a set of tested tools and/or workflows), that would be great!
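For context, here is a minimal sketch (in Python) of the kind of germline comparison I have in mind: run hap.py with a GIAB truth VCF and confident-regions BED against a GATK call set, then read precision and recall from the summary CSV. All file paths are placeholders, and the summary column names (Type, Filter, METRIC.Recall, METRIC.Precision) follow the hap.py summary.csv layout as I understand it.

#!/usr/bin/env python3
# Minimal sketch of a germline benchmarking step using hap.py.
# File paths are placeholders; column names are assumed from hap.py's
# summary.csv output and should be checked against the installed version.
import csv
import subprocess

# Placeholder inputs: GIAB HG002 truth set, the GATK call set to evaluate,
# the matching reference FASTA, and an output prefix.
truth_vcf = "HG002_GRCh38_truth.vcf.gz"
confident_bed = "HG002_GRCh38_confident_regions.bed"
query_vcf = "HG002_gatk_calls.vcf.gz"
reference = "GRCh38.fasta"
out_prefix = "HG002_vs_truth"

# Run hap.py: truth VCF first, query VCF second, restricted to confident regions.
subprocess.run(
    [
        "hap.py", truth_vcf, query_vcf,
        "-f", confident_bed,
        "-r", reference,
        "-o", out_prefix,
    ],
    check=True,
)

# Report SNP and indel precision/recall from the PASS rows of the summary.
with open(f"{out_prefix}.summary.csv", newline="") as handle:
    for row in csv.DictReader(handle):
        if row["Filter"] == "PASS":
            print(
                f"{row['Type']:>5}  "
                f"recall={row['METRIC.Recall']}  "
                f"precision={row['METRIC.Precision']}"
            )

Publishing numbers like these per release, alongside run times for the tested workflows, would make regressions (or improvements) immediately visible.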
Please let me know what you think, and if you have any questions.
Best regards, @skchronicles