ga4gh / vrs

Extensible specification for representing and uniquely identifying biological sequence variation
https://vrs.ga4gh.org
Apache License 2.0

Admixture values in CopyNumberCount and other forms of variation #445

Mrinal-Thomas-Epic closed this issue 7 months ago

Mrinal-Thomas-Epic commented 1 year ago

CopyNumberCount has copies as either a Number, an IndefiniteRange, or a DefiniteRange. Number has a value property of type int, but copy numbers are often reported as floating-point numbers in cancer contexts. Additionally, the property description is "The integral number of copies of the subject in a system." Is there a reason VRS doesn't allow floating-point copy numbers?

It's worth noting that IndefiniteRange and DefiniteRange do currently allow floating-point values.
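To make the inconsistency concrete, here is a minimal Python sketch of the three shapes that copies can take, as described above. The class names mirror the VRS type names, but the field names and validation logic are assumptions for illustration only; this is not the vrs-python API or the normative schema.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Number:
    """Hypothetical stand-in for the VRS Number type: integers only."""
    value: int

    def __post_init__(self):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(self.value, bool) or not isinstance(self.value, int):
            raise TypeError("Number.value must be an integer")


@dataclass(frozen=True)
class DefiniteRange:
    """Hypothetical stand-in: bounded range; floats are accepted."""
    min: float
    max: float


@dataclass(frozen=True)
class IndefiniteRange:
    """Hypothetical stand-in: half-bounded range; floats are accepted."""
    value: float
    comparator: str  # assumed to be "<=" or ">="


# The three alternatives the issue lists for CopyNumberCount.copies.
Copies = Union[Number, IndefiniteRange, DefiniteRange]

integral = Number(3)                    # accepted
ranged = DefiniteRange(2.5, 3.5)        # accepted, although not integral
at_least = IndefiniteRange(2.6, ">=")   # accepted, although not integral

try:
    Number(2.6)                         # rejected: not an integer
    rejected = False
except TypeError:
    rejected = True
```

Under these assumptions, a value such as 2.6 cannot be expressed as a Number but can be expressed through either range type, which is the asymmetry the question is about.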

github-actions[bot] commented 10 months ago

This issue was marked stale due to inactivity.

ahwagner commented 9 months ago

In CNV reporting tools, particularly in the cancer domain, a floating point "copy number" value may be reported. However, this number typically conveys something much more semantically complex than a VRS CopyNumberCount variant does; it is usually an average copy number value across an admixture of healthy and disease cells with differing CopyNumberCount variants. Usually, these values are used to drive CopyNumberChange assessments in cancer samples. Similar challenges exist for Allele objects, where an Allele may be present/absent in a cellular admixture at a certain frequency.

I think we should discuss ways we might represent variants that are present in varying degrees across cellular admixtures.
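As an illustration of why a reported floating-point value is an average rather than a CopyNumberCount, here is a minimal sketch of a simple two-population mixture. The function name, the parameter names, and the assumption of exactly two populations (disease cells with one integer copy number, healthy cells with another) are hypothetical simplifications, not part of VRS or of any particular CNV caller.

```python
def average_copy_number(purity: float, disease_copies: int, healthy_copies: int = 2) -> float:
    """Average copy number across a two-population admixture.

    purity is the assumed fraction of disease cells in the sample (0.0 to 1.0).
    Each population carries its own integer copy count for the same subject.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    # Weighted mean of the two integer counts.
    return purity * disease_copies + (1.0 - purity) * healthy_copies


# 60% disease cells with 4 copies, 40% healthy cells with 2 copies:
# 0.6 * 4 + 0.4 * 2, approximately 3.2.
observed = average_copy_number(purity=0.6, disease_copies=4)
```

In this sketch the two underlying counts (4 and 2) are the kind of integer values a CopyNumberCount describes, while the averaged value is a property of the mixed sample.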

Mrinal-Thomas-Epic commented 9 months ago

I agree that the two concepts are closely related but distinct. We should discuss where it makes sense to create a spec for the floating-point "copy number" value.

Should the IndefiniteRange and DefiniteRange used by the VRS CopyNumberCount be restricted to integers? What is the use case for these ranges in CopyNumberCount? They feel conceptually closer to the floating-point "copy number" value, but I may not understand all the use cases.

EDIT: I see this change had already been made in 2.x