clingen-data-model / genegraph

Presents an RDF triplestore of gene information using GraphQL APIs

VRS 2.0 Schema Modifications for VariationDescriptors #761

Closed toneillbroad closed 10 months ago

toneillbroad commented 1 year ago

Step 1: Drop the "canonical_variation" item in the descriptor, and move all of its contained items ("id", "type", and "canonical_context") up a level.

Step 2: Change the type from "CanonicalVariationDescriptor" to "CanonicalVariation"

Step 3: Add a "digest" item to the top level containing just the digest portion of the CanonicalVariation "id" item at the top level.

Step 4: Change the outer-level "id" field (of the form ["id": "cgterms:VariationDescriptor_567576.2022-03-30"]) to be of the form ["identifier": "clinvar:567576.2022-03-30"].
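For illustration, Steps 1-4 could be sketched in Python roughly as follows. This is not genegraph code: the function name `to_vrs2_form` and the constant `OLD_ID_PREFIX` are hypothetical, and the sketch assumes that the digest is everything after the first "." in the CanonicalVariation "id" and that the outer "id" always starts with "cgterms:VariationDescriptor_". Because the outer "id" and the hoisted inner "id" share a key, the outer one is converted to "identifier" before the inner items are moved up.

```python
import copy

# Assumed prefix of the Version 1.0 outer "id" (taken from the sample in this issue).
OLD_ID_PREFIX = "cgterms:VariationDescriptor_"


def to_vrs2_form(descriptor: dict) -> dict:
    """Apply Steps 1-4 to a Version 1.0 CanonicalVariationDescriptor (hypothetical helper)."""
    out = copy.deepcopy(descriptor)

    # Step 4 (done first so the outer "id" is not overwritten by Step 1):
    # "cgterms:VariationDescriptor_<x>" becomes "identifier": "clinvar:<x>".
    outer_id = out.pop("id")
    if not outer_id.startswith(OLD_ID_PREFIX):
        raise ValueError(f"unexpected outer id: {outer_id!r}")
    out["identifier"] = "clinvar:" + outer_id[len(OLD_ID_PREFIX):]

    # Step 1: drop "canonical_variation" and move its items up a level.
    inner = out.pop("canonical_variation")
    out["id"] = inner["id"]
    out["canonical_context"] = inner["canonical_context"]

    # Step 2: the top-level type is now "CanonicalVariation".
    out["type"] = "CanonicalVariation"

    # Step 3: keep only the digest portion of the CanonicalVariation id
    # (assumed here to be the text after the first ".").
    out["digest"] = inner["id"].split(".", 1)[1]
    return out


# Trimmed-down Version 1.0 input using values from the sample in this issue.
old = {
    "type": "CanonicalVariationDescriptor",
    "id": "cgterms:VariationDescriptor_567576.2022-03-30",
    "canonical_variation": {
        "id": "ga4gh:CLV.yi7FnYRVHoVQzRebAXh_CQ0Syy8O_ezL",
        "type": "CanonicalVariation",
        "canonical_context": {
            "id": "ga4gh:VA.e64Sf4vI4oQGSTzIL1lWlm4hDnGniJHL",
            "type": "Allele",
        },
    },
}
new = to_vrs2_form(old)
```

Other top-level items ("label", "description", "xrefs", "members") are carried over unchanged by the deep copy.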

There are other annotations contained in the sample mockup Larry provided: `{

"id": "ga4gh:CLV.yi7FnYRVHoVQzRebAXh_CQ0Syy8O_ezL",
"type": "CanonicalVariation",

"label": "NM_004360.5(CDH1):c.376_382dup (p.His128fs)",
"description": "NM_004360.5(CDH1):c.376_382dup (p.His128fs)",

"canonical_context": {

    "id": "ga4gh:VA.e64Sf4vI4oQGSTzIL1lWlm4hDnGniJHL",
    "type": "Allele",

    -- this allele is also decoratable... but we are not showing that here...

    "location": {

        "id": "ga4gh:SL.HQWgJ_9cBc9_T0hyAq1z3wdIjgrvivdo",
        "type": "SequenceLocation",

        -- this location is also decoratable... but we are not showing that here...

        "start": {
            "type": "Number",
            "value": 68801874
        },
        "end": {
            "type": "Number",
            "value": 68801888
        },

        -- we also need to convert this sequence_id into a 'sequence' object so that it too can be decoratable... TBD
        "sequence_id": "ga4gh:SQ.yC_0RBj3fgBlvgyAuycbzdubtLxq-rE0"

    },
    "state": {
        "type": "LiteralSequenceExpression",
        "sequence": "CCGCCCCCCGCCCCCCGCCCC"
    }

},

-- we need a new attribute or method for tagging equivalent xrefs, in clinvar that would involve a versioned clinvar variation id.
"identifier": "clinvar:567576.2022-03-30",
"xrefs": [
    "https://www.ncbi.nlm.nih.gov/clinvar/567576",
    "https://identifiers.org/clinvar:567576"
],

-- we may need to revisit the VariationMember structure to make sure these are expressable as Variations... TBD
"members": [
    {
        "type": "VariationMember",
        "expressions": [
            {
                "type": "Expression",
                "syntax": "hgvs.g",
                "value": "NG_008021.1:g.69584CCGCCCC[3]"
            }
        ]
    },
    {
        "type": "VariationMember",
        "expressions": [
            {
                "type": "Expression",
                "syntax": "hgvs.c",
                "value": "NM_001317184.2:c.376_382dup"
            }
        ]
    },
    {
        "type": "VariationMember",
        "expressions": [
            {
                "type": "Expression",
                "syntax": "hgvs.c",
                "value": "NM_004360.5(CDH1):c.369_375CCGCCCC[3]"
            }
        ]
    },
    {
        "type": "VariationMember",
        "expressions": [
            {
                "type": "Expression",
                "syntax": "hgvs.g",
                "value": "NC_000016.9:g.68835778CCGCCCC[3]"
            }
        ]
    },
    {
        "type": "VariationMember",
        "expressions": [
            {
                "type": "Expression",
                "syntax": "hgvs.p",
                "value": "NP_001304113.1:p.His128fs"
            }
        ]
    },
    {
        "type": "VariationMember",
        "expressions": [
            {
                "type": "Expression",
                "syntax": "hgvs.c",
                "value": "NM_001317186.2:c.-1451CCGCCCC[3]"
            }
        ]
    },
    {
        "type": "VariationMember",
        "expressions": [
            {
                "type": "Expression",
                "syntax": "hgvs.c",
                "value": "NM_001317185.2:c.-1247CCGCCCC[3]"
            }
        ]
    },
    {
        "type": "VariationMember",
        "expressions": [
            {
                "type": "Expression",
                "syntax": "spdi",
                "value": "NC_000016.10:68801874:CCGCCCCCCGCCCC:CCGCCCCCCGCCCCCCGCCCC"
            }
        ]
    },
    {
        "type": "VariationMember",
        "expressions": [
            {
                "type": "Expression",
                "syntax": "hgvs.p",
                "value": "NP_004351.1:p.His128fs"
            }
        ]
    },
    {
        "type": "VariationMember",
        "expressions": [
            {
                "type": "Expression",
                "syntax": "hgvs.c",
                "value": "NM_004360.5:c.376_382dup"
            }
        ]
    },
    {
        "type": "VariationMember",
        "expressions": [
            {
                "type": "Expression",
                "syntax": "hgvs.g",
                "value": "NC_000016.10:g.68801875CCGCCCC[3]"
            }
        ]
    },
    {
        "type": "VariationMember",
        "expressions": [
            {
                "type": "Expression",
                "syntax": "hgvs.g",
                "value": "LRG_301:g.69584CCGCCCC[3]"
            }
        ]
    }
]

}`
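Note that the mockup above is not valid JSON as written, because of the annotation lines that start with "--". A small sketch for loading it anyway, assuming every annotation sits on its own line and begins with "--" (the helper name `parse_annotated_mockup` is made up for this example):

```python
import json


def parse_annotated_mockup(text: str) -> dict:
    """Drop lines that start with "--" and parse the remainder as JSON."""
    kept = [
        line for line in text.splitlines()
        if not line.lstrip().startswith("--")
    ]
    return json.loads("\n".join(kept))


# A short fragment in the same style as the mockup.
snippet = """{
"id": "ga4gh:CLV.yi7FnYRVHoVQzRebAXh_CQ0Syy8O_ezL",
"type": "CanonicalVariation",

-- we need a new attribute or method for tagging equivalent xrefs
"identifier": "clinvar:567576.2022-03-30"
}"""
parsed = parse_annotated_mockup(snippet)
```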

This is a sample file that is in the current form (i.e. Version 1.0 form) - but it is not the variant.

{
  "description": "NM_004360.5(CDH1):c.376_382dup (p.His128fs)",
  "type": "CanonicalVariationDescriptor",
  "xrefs": [
    "https://www.ncbi.nlm.nih.gov/clinvar/567576",
    "https://identifiers.org/clinvar:567576"
  ],
  "canonical_variation": {
    "id": "ga4gh:CLV.yi7FnYRVHoVQzRebAXh_CQ0Syy8O_ezL",
    "type": "CanonicalVariation",
    "canonical_context": {
      "id": "ga4gh:VA.e64Sf4vI4oQGSTzIL1lWlm4hDnGniJHL",
      "type": "Allele",
      "location": {
        "id": "ga4gh:SL.HQWgJ_9cBc9_T0hyAq1z3wdIjgrvivdo",
        "type": "SequenceLocation",
        "sequence_id": "ga4gh:SQ.yC_0RBj3fgBlvgyAuycbzdubtLxq-rE0",
        "start": { "type": "Number", "value": 68801874 },
        "end": { "type": "Number", "value": 68801888 }
      },
      "state": { "type": "LiteralSequenceExpression", "sequence": "CCGCCCCCCGCCCCCCGCCCC" }
    }
  },
  "label": "NM_004360.5(CDH1):c.376_382dup (p.His128fs)",
  "id": "cgterms:VariationDescriptor_567576.2022-03-30",
  "members": [
    { "type": "VariationMember", "expressions": [ { "type": "Expression", "syntax": "hgvs.g", "value": "NG_008021.1:g.69584CCGCCCC[3]" } ] },
    { "type": "VariationMember", "expressions": [ { "type": "Expression", "syntax": "hgvs.c", "value": "NM_001317184.2:c.376_382dup" } ] },
    { "type": "VariationMember", "expressions": [ { "type": "Expression", "syntax": "hgvs.c", "value": "NM_004360.5(CDH1):c.369_375CCGCCCC[3]" } ] },
    { "type": "VariationMember", "expressions": [ { "type": "Expression", "syntax": "hgvs.g", "value": "NC_000016.9:g.68835778CCGCCCC[3]" } ] },
    { "type": "VariationMember", "expressions": [ { "type": "Expression", "syntax": "hgvs.p", "value": "NP_001304113.1:p.His128fs" } ] },
    { "type": "VariationMember", "expressions": [ { "type": "Expression", "syntax": "hgvs.c", "value": "NM_001317186.2:c.-1451CCGCCCC[3]" } ] },
    { "type": "VariationMember", "expressions": [ { "type": "Expression", "syntax": "hgvs.c", "value": "NM_001317185.2:c.-1247CCGCCCC[3]" } ] },
    { "type": "VariationMember", "expressions": [ { "type": "Expression", "syntax": "spdi", "value": "NC_000016.10:68801874:CCGCCCCCCGCCCC:CCGCCCCCCGCCCCCCGCCCC" } ] },
    { "type": "VariationMember", "expressions": [ { "type": "Expression", "syntax": "hgvs.p", "value": "NP_004351.1:p.His128fs" } ] },
    { "type": "VariationMember", "expressions": [ { "type": "Expression", "syntax": "hgvs.c", "value": "NM_004360.5:c.376_382dup" } ] },
    { "type": "VariationMember", "expressions": [ { "type": "Expression", "syntax": "hgvs.g", "value": "NC_000016.10:g.68801875CCGCCCC[3]" } ] },
    { "type": "VariationMember", "expressions": [ { "type": "Expression", "syntax": "hgvs.g", "value": "LRG_301:g.69584CCGCCCC[3]" } ] }
  ]
}

theferrit32 commented 1 year ago

Standup 4/25: we will defer this and related tasks, such as restructuring the VA-spec Statement and Statement sub-schemas like Condition, Disease, Phenotype, ConditionDescriptor, and Proposition, until the ideas are more fleshed out over the next few weeks.

toneillbroad commented 10 months ago

This is an outdated version of VRS. Closing.