luederm closed this issue 1 year ago
Thanks for pointing that out to me. Some calls outside the high-confidence regions were left in those files. I'll make a note of that in the README and release corrected versions of the high-confidence sSNV/sINDEL files.
SEQC2 has released an update, with a README.md there explaining the differences (and why):
https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/seqc/Somatic_Mutation_WG/release/latest/
I downloaded high-confidence_sSNV_in_HC_regions_v1.2.vcf.gz and high-confidence_sINDEL_in_HC_regions_v1.2.vcf.gz from the FTP site but noticed that some of the variants are not in the regions defined by High-Confidence_Regions_v1.2.bed. This led to issues when I compared my results (after filtering with the supplied BED file) to the HC reference call set.
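For anyone who wants to reproduce the check, here is a minimal sketch of one way to list the VCF records that fall outside a BED file. It is not the method used by SEQC2 or by me originally; the function names (`load_bed`, `in_regions`, `variants_outside`, `open_text`) are made up for illustration. It assumes the BED intervals do not overlap each other, and it tests only the VCF POS column, so an indel whose REF allele extends past the end of a region is still counted as inside.

```python
import bisect
import gzip
from collections import defaultdict


def load_bed(lines):
    """Parse BED lines into {chrom: (sorted starts, matching ends)}."""
    by_chrom = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        by_chrom[chrom].append((int(start), int(end)))
    regions = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        regions[chrom] = ([s for s, _ in intervals], [e for _, e in intervals])
    return regions


def in_regions(regions, chrom, pos):
    """True if the 1-based VCF position lies in a BED interval.

    BED is 0-based and half-open, so POS p is covered when start <= p - 1 < end.
    Assumes the intervals on each chromosome do not overlap.
    """
    if chrom not in regions:
        return False
    starts, ends = regions[chrom]
    zero_based = pos - 1
    i = bisect.bisect_right(starts, zero_based) - 1
    return i >= 0 and zero_based < ends[i]


def variants_outside(vcf_lines, regions):
    """Return (chrom, pos) for every VCF record whose POS is outside the regions."""
    outside = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.split("\t")
        chrom, pos = fields[0], int(fields[1])
        if not in_regions(regions, chrom, pos):
            outside.append((chrom, pos))
    return outside


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)
```

With the downloaded files in the working directory, usage would look something like this:

```python
with open_text("High-Confidence_Regions_v1.2.bed") as bed:
    regions = load_bed(bed)
with open_text("high-confidence_sINDEL_in_HC_regions_v1.2.vcf.gz") as vcf:
    for chrom, pos in variants_outside(vcf, regions):
        print(chrom, pos)
```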