griffithlab / civic-client

Web client for CIViC: Clinical Interpretations of Variants in Cancer
MIT License

Add sentence to releases page to explain variant number discrepancy between VCF and TSV #1490

Closed susannasiebert closed 3 years ago

susannasiebert commented 4 years ago

"The VCFs only include variants with complete coordinates and, thus, the number of variants will be lower in the VCFs compared to the number of variants in the TSV."

or something to that effect

kkrysiak commented 4 years ago

I don't know the specifics of this. @ahwagner @obigriffith or @malachig can you assist in providing some text to add to the Data Releases page?

malachig commented 3 years ago

To comply with the VCF specification, the VCFs can only include variants with complete coordinates. By contrast, the TSV variants file may contain variants whose coordinates have not been fully curated in CIViC. In addition, some variants are of types that cannot be properly represented in VCF format. Thus, the number of variants will be lower in the VCFs than in the TSV.
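
To make the discrepancy concrete, here is a minimal sketch of how the two counts can diverge. It is not the actual CIViC export code: the column names (`chromosome`, `start`, `stop`, `reference_bases`, `variant_bases`), the helper functions, and the example rows are all assumptions made up for illustration, and "complete coordinates" is approximated here as "every one of those fields is non-empty".

```python
import csv
import io

# Assumed (hypothetical) column names; the real TSV header may differ.
REQUIRED = ("chromosome", "start", "stop", "reference_bases", "variant_bases")


def has_complete_coordinates(row):
    """True if every assumed coordinate field is present and non-empty."""
    return all((row.get(col) or "").strip() for col in REQUIRED)


def count_variants(tsv_text):
    """Return (rows in the TSV, rows that could be written as VCF records)."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    vcf_ready = [r for r in rows if has_complete_coordinates(r)]
    return len(rows), len(vcf_ready)


# Made-up example data, for illustration only (not real CIViC records).
EXAMPLE_TSV = (
    "variant\tchromosome\tstart\tstop\treference_bases\tvariant_bases\n"
    "SNV_EXAMPLE\t1\t100\t100\tA\tT\n"          # complete coordinates
    "EXPRESSION_EXAMPLE\t\t\t\t\t\n"            # no coordinates curated
    "AMPLIFICATION_EXAMPLE\t2\t200\t900\t\t\n"  # no ref/alt alleles
)

total, vcf_ready = count_variants(EXAMPLE_TSV)
print(f"TSV variants: {total}, VCF-representable variants: {vcf_ready}")
```

In this toy input only the first row has everything a VCF record needs, so the VCF count comes out lower than the TSV count, which is the situation the proposed sentence describes.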

obigriffith commented 3 years ago

Malachi's text sounds good.