clingen-data-model / clinvar-ingest-reports

ClinGen generates several Google Sheets-based reports from the ingested ClinVar data, which originates from the Broad BigQuery data.

One-off ClinVar report for McGurk/Socarras CS project #89

Closed larrybabb closed 7 months ago

larrybabb commented 7 months ago

Kayla S and Kathryn M are working with Heidi on a CS project. They requested the following report from ClinVar.

https://the-tgg.slack.com/archives/C06KPQN2ZNW/p1708535191518009

@Larry Babb (he/him) could we use Kayla's Google Sheet to pull out the VCV-level classification for all star level variants in those genes in two groups:

  1. P+LP+P/LP
  2. VUS+Conflicting (but includes a LP or P report, e.g., VUS-LP and not VUS-LB)

Where the output has: Variant locus [format CHR:POS], REF, ALT, ClinVar Classification, ClinVar ID. And these too if easy: Gene Symbol, Consequence.
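The two requested groups can be sketched as a small classifier. This is only one reading of the request, not the query that was actually run: it assumes group 2 means any VUS, plus conflicting variants that have at least one P or LP submission. The function name `report_group`, its parameters, and the exact classification strings are illustrative assumptions.

```python
# Aggregate (VCV-level) classifications assumed to make up group 1: P, LP, P/LP.
GROUP_1 = {"pathogenic", "likely pathogenic", "pathogenic/likely pathogenic"}
# Submitted classifications that qualify a conflicting variant for group 2.
P_OR_LP = {"pathogenic", "likely pathogenic"}


def report_group(vcv_classification, submitted_classifications=()):
    """Return 1, 2, or None for a variant (hypothetical helper).

    vcv_classification: the aggregate classification string.
    submitted_classifications: the individual submitted classifications,
    needed only to decide whether a conflicting variant has a P or LP report.
    """
    agg = vcv_classification.strip().lower()
    if agg in GROUP_1:
        return 1
    if agg == "uncertain significance":
        return 2
    subs = {s.strip().lower() for s in submitted_classifications}
    if agg.startswith("conflicting") and subs & P_OR_LP:
        return 2  # e.g. VUS vs. LP, but not VUS vs. LB
    return None
```

A conflicting variant with only VUS and Likely benign submissions returns `None` here, which matches the "VUS-LP and not VUS-LB" example in the request.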

-- The following dialog ensued with additional specifications:

question 1: by "star level variants" do you mean any variant that has at least one star for the VCV-level classification (so all 1, 2, 3 and 4 star, but NOT 0 star)?

question 2: do you want the GRCh38 position in the variant locus or 37?

question 3: if there is more than one (molecular) consequence (due to multiple transcripts) do you want them all?

Please verify the final data format: CHR:POS, REF, ALT, Classification, ClinVar Variant ID (or VCV ID?), Gene Symbol, Molecular Consequence (just one). So an example for ClinVar variant 10 would look as follows (assuming GRCh38 positions):

"6:26090951", "C", "G", "Pathogenic/Pathogenic, low penetrance; other", "10", "HFE", "missense"

question 4: should I skip over any variants that include more than one gene? If not, should I use the rule that at least one of the genes in the list must be associated? (Heidi often has me just address the single-gene variants)

Kathryn McGurk

@Heidi have added answers below, let us know if you'd change anything before @Larry Babb (he/him) extracts:

Yep >=1; needs a disease criteria

GRCh38

No just one: canonical/most damaging

Yep to: CHR:POS, REF, ALT, Classification, I think variation ID [just some way of checking reports], Gene Symbol, Molecular Consequence (just one)

I think single-gene variants if that's the usual suggestion

In your example, looks great but I think we'd exclude variant 10 because of "low penetrance"? - a Q for @Heidi

Thanks Larry!

Heidi Rehm, 5 days ago

Yes, agree. Let’s exclude any with low penetrance. But for 3, I’d use worst consequence of MANE Select or MANE Plus Clinical only
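Taken together, the agreed rules (at least one star, GRCh38 positions, single-gene variants only, nothing with "low penetrance") could be applied roughly as in the sketch below. The `Variant` record and its field names are hypothetical and do not reflect the actual ingest schema. The sketch also assumes `consequence` has already been reduced to the worst consequence on a MANE Select or MANE Plus Clinical transcript.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Variant:
    # Hypothetical record; field names are illustrative only.
    variation_id: str
    chrom: str
    pos_grch38: int
    ref: str
    alt: str
    classification: str
    stars: int
    genes: List[str]
    consequence: str  # assumed pre-selected (worst MANE consequence)


def keep_variant(v, target_genes):
    """Apply the agreed inclusion rules to one variant."""
    if v.stars < 1:
        return False  # 0-star variants are excluded
    if "low penetrance" in v.classification.lower():
        return False  # excluded per the dialog
    if len(v.genes) != 1:
        return False  # single-gene variants only
    return v.genes[0] in target_genes


def format_row(v):
    """Build one output row in the agreed column order."""
    return [
        f"{v.chrom}:{v.pos_grch38}",  # variant locus as CHR:POS (GRCh38)
        v.ref,
        v.alt,
        v.classification,
        v.variation_id,
        v.genes[0],
        v.consequence,
    ]
```

For the variant 10 example from the dialog, `format_row` reproduces the row shown there, while `keep_variant` rejects it because its classification contains "low penetrance".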

larrybabb commented 7 months ago

Shipped this to the cs_project group on 3/2/2024