ga4gh / vrs

Extensible specification for representing and uniquely identifying biological sequence variation
https://vrs.ga4gh.org
Apache License 2.0

What makes it special/different/better than SPDI, VCF, and others? #305

Closed larrybabb closed 3 years ago

larrybabb commented 3 years ago

This question was raised by a registrant for the June 2, 2021 GHIF VRS webinar. Here are the recording and slides from the webinar.

During the webinar we discussed how VRS differs from HGVS, VCF, and SPDI. I recommend viewing the webinar for a more detailed explanation.

I invite others to add details here.

The key differences and benefits of VRS relative to these are:

  1. VRS is being designed as an information model made of atomic building blocks that can be composed into higher-order variant representations. Its primary function is precise computational data exchange.
  2. VRS is also extensible. It is not limited to simple SNVs, delins, or any other subset of variation, so it can serve as a standard that grows to cover the types of variation that other methods, nomenclatures, and authorizing registries (SPDI, VCF, and HGVS) often cannot represent.
  3. VRS is not limited to genomic sequence; it applies to any type of sequence (genomic, transcript, protein).
  4. VRS is not limited to sequence-based variation. It can also cover variation described in terms of cytobands, systemic expression, and genetic features.
  5. SPDI covers only alleles and precise genomic variation. Its nomenclature is built on VOCA (Variant Overprecision Correction Algorithm), as specified by NCBI. VRS is built on VOCA as well for the types of variation that fall within its domain.
  6. VCF is genomic only. VCF is a file format, designed primarily for high-volume, compact variant calls. It is not designed to be extensible in the way VRS is, so it does not support much broader representations of variation independent of samples or cohorts. VCF does not normalize small, precise SNVs and delins using the same VOCA-based normalization.
  7. HGVS is a nomenclature. It is designed primarily for human readability, not computational identification. HGVS is not applied consistently in reporting, literature, and databases, even though there have been great strides in tooling to validate HGVS syntax. HGVS does not normalize variation using VOCA, and several HGVS expressions can represent the same variant. VRS is not designed to be human-readable. We have started designing implementation guidance for wrapping VRS representations in Value Object Descriptors, which let exchange systems add human-readable and useful attributes that improve the productivity of data exchange contracts involving variation (see VRSATILE).
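
To make the VOCA points in items 5 to 7 a bit more concrete, here is a rough Python sketch of the general idea: an insertion or deletion inside a repeat can be written at several positions, and expanding it to cover the whole ambiguous region gives every one of those spellings the same representation. This is a simplified illustration, not the NCBI or VRS reference implementation. The function name `normalize`, the toy reference string `REF`, and the `(start, end, alt)` tuple in interbase (0-based, between-base) coordinates are all assumptions made up for this example.

```python
def normalize(ref: str, start: int, end: int, alt: str) -> tuple[int, int, str]:
    """Simplified, VOCA-style normalization (illustrative assumption only).

    ref[start:end] is replaced by alt; coordinates are interbase.
    Returns an expanded (start, end, alt) covering the ambiguous region.
    """
    # Step 1: trim bases shared by the reference span and the alternate,
    # first from the right end, then from the left end.
    while start < end and alt and ref[end - 1] == alt[-1]:
        end -= 1
        alt = alt[:-1]
    while start < end and alt and ref[start] == alt[0]:
        start += 1
        alt = alt[1:]

    is_deletion = end > start and not alt
    is_insertion = end == start and bool(alt)
    if not (is_deletion or is_insertion):
        # Substitution, delins, or no change: nothing to shift.
        return start, end, alt

    # Step 2: the inserted or deleted bases, which we try to shift.
    seq = alt if is_insertion else ref[start:end]

    # Roll left as far as the edit can move without changing the result.
    left, left_seq = start, seq
    while left > 0 and ref[left - 1] == left_seq[-1]:
        left_seq = left_seq[-1] + left_seq[:-1]
        left -= 1

    # Roll right in the same way.
    right, right_seq = end, seq
    while right < len(ref) and ref[right] == right_seq[0]:
        right_seq = right_seq[1:] + right_seq[0]
        right += 1

    # Step 3: describe the edit over the whole ambiguous region.
    if is_insertion:
        new_alt = left_seq + ref[left:right]
    else:
        new_alt = ref[left + len(seq):right]
    return left, right, new_alt


# Toy reference with a CA repeat (hypothetical data, not a real sequence).
REF = "GCACACAT"

# Three spellings of the same two-base deletion.
print(normalize(REF, 1, 3, ""))   # (1, 7, 'CACA')
print(normalize(REF, 2, 4, ""))   # (1, 7, 'CACA')
print(normalize(REF, 0, 3, "G"))  # (1, 7, 'CACA')
```

All three inputs describe the same edited sequence (`GCACAT`), and after normalization they are identical, which is what makes a computed identifier stable. A substitution such as `normalize(REF, 3, 4, "T")` comes back unchanged because there is nothing to shift.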

That's a braindump of the items I have for now. I hope this is informative.

reece commented 3 years ago

Moved to FAQ. Closing.