clingen-data-model / genegraph

Presents an RDF triplestore of gene information using GraphQL APIs

Implement CvC VCV aggregation #185

Closed theferrit32 closed 8 months ago

theferrit32 commented 3 years ago

This should be done on ingest of the clinvar-combined stream. Within a single release, the upstream process orders the messages so that individual clinical assertions (SCVs) are always ingested before aggregate assertions (VCVs). The re-aggregation of a VCV into split aggregate assertions, one per classification context, can therefore be done whenever a variation_archive message is received, by selecting the non-deleted, most recent version of every SCV that is a member of that variation archive.
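A rough sketch of that selection-and-split step, in Python for illustration only (this is not genegraph code). The `Scv` record, its field names, and the context labels are all hypothetical, and it assumes that an SCV whose most recent version is flagged deleted drops out of the aggregate entirely:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Scv:
    # Hypothetical record shape; the real clinvar-combined messages may differ.
    scv_id: str
    version: int
    vcv_id: str
    context: str  # classification context label (hypothetical values)
    deleted: bool = False


def latest_live_scvs(scvs, vcv_id):
    """Most recent version of each SCV in the given VCV, minus deleted ones."""
    latest = {}
    for scv in scvs:
        if scv.vcv_id != vcv_id:
            continue
        current = latest.get(scv.scv_id)
        if current is None or scv.version > current.version:
            latest[scv.scv_id] = scv
    # Assumption: a deleted latest version removes the SCV from the aggregate.
    return [scv for scv in latest.values() if not scv.deleted]


def aggregate_by_context(scvs, vcv_id):
    """Split one VCV into one group of SCV ids per classification context."""
    groups = defaultdict(list)
    for scv in latest_live_scvs(scvs, vcv_id):
        groups[scv.context].append(scv.scv_id)
    return {context: sorted(ids) for context, ids in groups.items()}


# SCVs already ingested earlier in the release, before the VCV message arrives.
seen = [
    Scv("SCV1", 1, "VCV1", "context-a"),
    Scv("SCV1", 2, "VCV1", "context-a"),
    Scv("SCV2", 1, "VCV1", "context-b"),
    Scv("SCV3", 1, "VCV1", "context-a"),
    Scv("SCV3", 2, "VCV1", "context-a", deleted=True),
    Scv("SCV4", 1, "VCV2", "context-a"),
]
result = aggregate_by_context(seen, "VCV1")
```

Because SCVs arrive before their VCV within a release, the lookup only has to consider records already ingested when the variation_archive message is handled.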

The contribution field should probably mirror the upstream release date in the message rather than the date the transformer physically created the record. The contribution agent should be set to clingen, or something similar. Using the release date of the upstream VCV record will ensure a consistent release_date ordering in SPARQL queries, particularly after genegraph data is reloaded from upstream data.
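To illustrate the point about reloads, here is a minimal Python sketch. The `Contribution` type, the `contribution_for` helper, and the `release_date` message key are assumptions made for the example, not the actual genegraph model:

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Contribution:
    # Hypothetical shape for the contribution attached to an aggregate assertion.
    agent: str
    date: date


def contribution_for(message: dict, agent: str = "clingen") -> Contribution:
    """Build the contribution from the upstream release date in the message.

    The wall clock is deliberately not consulted, so re-ingesting the same
    message later produces the same contribution date.
    """
    # Assumption: the message carries an ISO-8601 "release_date" string.
    return Contribution(agent=agent, date=date.fromisoformat(message["release_date"]))


vcv_message = {"release_date": "2021-01-02"}  # illustrative value only
first_load = contribution_for(vcv_message)
after_reload = contribution_for(vcv_message)
```

Since the date comes from the message and not from the time of ingest, `first_load` and `after_reload` are equal, which is what keeps the ordering stable across reloads.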

KelseaChang5 commented 2 years ago

December 7th, 2021 Triage: Renamed from "Generate restructured aggregate clinical assertions based on ClinGen aggregation rules for ClinVar SCVs" to "Implement CvC VCV aggregation".

theferrit32 commented 8 months ago

Closed as won't implement.

Will revisit later and can recreate the ticket as needed.