clingen-data-model / clinvar-submitter

Application for transforming messages formatted according to the data model for ClinVar submission
Eclipse Public License 1.0

Periodic generation of file used to supplement non-novel clinvar submissions with SCV ID #7

Closed toneillbroad closed 3 years ago

toneillbroad commented 3 years ago

We need to periodically generate a file that will be included in the clinvar-submitter Docker container and used to supplement all non-novel ClinVar submissions with the proper SCV ID.

The file will contain four fields: Local ID, Variation ID, submitter, and SCV ID. The format is to be determined by the developer. In practice, the only fields that are needed are the Variation ID and SCV ID, as the submitter is not currently used by the code.

A single entry in the current file looks like this: { "SubmitterAbbr": "HL", "Lookup": "HL2353", "LocalID": "", "VariationID": 2353, "ClinicalSignificance": "Pathogenic", "DateLastEvaluated": 43368, "Description": "The allele frequency of the p.Cys1447GlnfsX29 variant in the USH2A gene is 0.0009% (1/111250) of European (Non-Finnish) chromosomes by the Genome Aggregation Database (http://gnomad.broadinstitute.org), which is a low enough frequency to award PM2 based on the thresholds defined by the ClinGen Hearing Loss Expert Panel for autosomal recessive hearing loss (PM2). The p.Cys1447GlnfsX29 variant is predicted to cause a premature stop codon in biologically-relevant-exon 20 of 72 that leads to an absent protein in a gene in which loss-of-function is an established mechanism (PVS1). This variant has been detected as compound heterozygous with p.Cys759Phe or p.Glu767SerfsX21 in six Usher syndrome probands, and as homozygous in eight Usher syndrome probands (PM3_VeryStrong; PMID: 9624053, 15325563, 18641288, 18665195, 20440071). The p.Cys1447GlnfsX29 variant in USH2A has been reported to segregate with hearing loss in at least 2 family members (PP1_Moderate; PMID: 20440071, 9624053). At least one patient with a variant in this gene displayed features of mild to severe hearing loss and retinitis pigmentosa (PP4; PMID: 9624053, 15325563, 18641288, 18665195, 20440071). In summary, this variant meets criteria to be classified as pathogenic for autosomal recessive Usher syndrome based on the ACMG/AMP criteria applied, as specified by the Hearing Loss Expert Panel: PM2, PVS1, PM3_VeryStrong, PP1_Moderate, PP4.", "SubmittedPhenotypeInfo": "Orphanet:ORPHA886", "ReportedPhenotypeInfo": "C0271097:Usher syndrome", "ReviewStatus": "reviewed by expert panel", "CollectionMethod": "curation", "OriginCounts": "germline:na", "Submitter": "ClinGen Hearing Loss Expert Panel", "SCV": "SCV000840529.1", "SubmittedGeneSymbol": "USH2A", "ExplanationOfInterpretation": "-" }

All the new file needs are these fields from the entry above: LocalID, VariationID, and SCV.
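
A minimal sketch of trimming full entries down to those fields, assuming the current file is a JSON array of entries shaped like the one above. The function name `reduce_entries` and the choice to keep the original key names are hypothetical; the issue leaves the output format to the developer.

```python
import json


def reduce_entries(entries):
    """Keep only LocalID, VariationID and SCV from each full entry."""
    reduced = []
    for entry in entries:
        reduced.append({
            # LocalID is empty in the example entry, so default to "".
            "LocalID": entry.get("LocalID", ""),
            "VariationID": entry["VariationID"],
            "SCV": entry["SCV"],
        })
    return reduced


if __name__ == "__main__":
    # Abbreviated copy of the example entry; the other fields are dropped.
    sample = [{
        "SubmitterAbbr": "HL",
        "Lookup": "HL2353",
        "LocalID": "",
        "VariationID": 2353,
        "ClinicalSignificance": "Pathogenic",
        "SCV": "SCV000840529.1",
    }]
    print(json.dumps(reduce_entries(sample), indent=2))
```

Keeping the original key names means the clinvar-submitter code can read the trimmed file with the same field lookups it would use on the full one.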

The file is to be generated on a periodic basis, preferably weekly to coincide with weekly ClinVar releases, through a mechanism to be determined by the developer (cron job, cloud scheduler, etc.). The file is to be automatically checked into the clinvar-submitter GitHub project. This should cause the Argo CI/CD software to build a new Docker container containing the file, and that container should then be automatically deployed to production.
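
One way the scheduled job could decide whether a check-in is needed at all, sketched under assumptions that are not part of the issue: the file is written as JSON in a deterministic order, and a commit is made only when the content differs from the previous run. The function name `write_if_changed` is hypothetical, and the scheduling and git steps are left out.

```python
import json
from pathlib import Path


def write_if_changed(records, path):
    """Write records as deterministic JSON; return True only if the file changed."""
    target = Path(path)
    # Sort records and keys so the same data always serializes identically.
    ordered = sorted(records, key=lambda r: (r["VariationID"], r["SCV"]))
    new_text = json.dumps(ordered, indent=2, sort_keys=True) + "\n"
    if target.exists() and target.read_text(encoding="utf-8") == new_text:
        return False  # nothing new since the last run, so no commit is needed
    target.write_text(new_text, encoding="utf-8")
    return True
```

Committing only when the function returns True would avoid triggering a container rebuild in weeks where no SCV IDs changed.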

toneillbroad commented 3 years ago

@larrybabb could you get me an example of this generated file so I can incorporate it into the clinvar-submitter code while the external tasks, such as periodic generation and the container build, are in progress?