lgmgeo / AnnotSV

Annotation and Ranking of Structural Variation
GNU General Public License v3.0

Assessment of ClinGen criteria 2F/4O #215

Closed milena-andreuzo closed 7 months ago

milena-andreuzo commented 8 months ago

Hello! I was running an analysis with AnnotSV 3.2.3 using its Docker image (we've been using this version and haven't validated newer ones yet) and came across a ClinGen criterion that I could not understand.

Some variants as an example:

| chr | start | end | sv_type | annotsv ranking score | criteria |
|---|---|---|---|---|---|
| chr1 | 93823040 | 93825244 | \ | -0.9 | 1A (cf Gene_count, RE_gene, +0.00);2F/4O (cf B_loss_source, -1.00);3A (1 gene, +0.00);5G (BCAR3, +0.10) |
| chr14 | 73532264 | 73556852 | \ | -0.9 | 1A (cf Gene_count, RE_gene, +0.00);2F/4O (cf B_loss_source, -1.00);2B (cf po_P_loss_source, HI and OMIM_morbid, +0.00);3A (2 genes, +0.00);5G (ACOT1, +0.10) |
| chrX | 32969207 | 32970207 | \ | -0.9 | 1A (cf Gene_count, RE_gene, +0.00);2F/4O (cf B_loss_source, -1.00);3A (1 gene, +0.00);5G (DMD, +0.10) |
| chr5 | 78814032 | 78816032 | \ | -0.7 | 1A (cf Gene_count, RE_gene, +0.00);2F/4O (cf B_loss_source, -1.00);2B (cf po_P_loss_source, HI and OMIM_morbid, +0.00);3A (1 gene, +0.00);5H (ARSB, +0.30) |
| chr1 | 109644578 | 109647981 | \ | -1.6 | 1B (cf Gene_count, RE_gene, -0.60);2F/4O (cf B_loss_source, -1.00);2B (cf po_P_loss_source, HI and OMIM_morbid, +0.00);3A (0 gene, +0.00);5F (+0.00) |
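
In these five rows the ranking score matches the sum of the per-criterion contributions, with 2F/4O contributing -1.00 only once. A minimal Python sketch of that check, assuming the criteria string format shown above (the function name `sum_criteria` is hypothetical and not part of AnnotSV):

```python
import re

# A signed decimal right before a closing parenthesis, e.g. "-1.00)" or "+0.10)".
_CONTRIBUTION = re.compile(r"([+-]\d+(?:\.\d+)?)\)")


def sum_criteria(criteria: str) -> float:
    """Sum the signed contributions found in a ';'-separated criteria string."""
    total = 0.0
    for item in criteria.split(";"):
        match = _CONTRIBUTION.search(item)
        if match:  # items without a numeric contribution are ignored
            total += float(match.group(1))
    return round(total, 2)


row = ("1A (cf Gene_count, RE_gene, +0.00);2F/4O (cf B_loss_source, -1.00);"
       "3A (1 gene, +0.00);5G (BCAR3, +0.10)")
print(sum_criteria(row))
```

This only reproduces the arithmetic visible in the examples; it says nothing about how AnnotSV computes the score internally.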

I don't understand why 2F and 4O are both assigned but counted only once, as 2F/4O (cf B_loss_source, -1.00). I couldn't find any material or reference in the ClinGen/ACMG articles and platforms that specifically relates these two rules. For the LOSS calculator, there is this header that you mention in the code:

Section 4: Detailed Evaluation of Genomic Content Using Published Literature, Public Databases, and/or Internal Lab Data (Skip to Section 5 if either your CNV overlapped with an established HI gene/region in Section 2, OR there have been no reports associating either the CNV or any genes within the CNV with human phenotypes caused by loss of function (LOF) or copy number loss)

It says "Skip to Section 5 if either your CNV overlapped with an established HI gene/region in Section 2", so I understand that Section 4 is skipped altogether in that case. However, criterion 2F is about "Overlap with ESTABLISHED benign genes or genomic regions", not established haploinsufficient genes or genomic regions (2A-2E).

Why do you not apply 4O if 2F is applied?

Your code snippet from tag v3.2.3 (on the current release this still happens).

lgmgeo commented 8 months ago

I am out of the office. I will be back on March 11 and will respond to your issue at that time.

lgmgeo commented 8 months ago

Hi,

Thank you for this interesting debate. I'm definitely not an expert on SV ranking, so the discussion is open.

First, we need to keep in mind that AnnotSV estimates the 2F and 4O criteria from the same data source (the benign SV dataset). Note also that common population SVs and established benign SVs are merged in the AnnotSV benign SV dataset (please see the README for more information).

Then, as a reminder:

The AnnotSV comprehensive and detailed scoring guidelines are available in the Scoring_Criteria_AnnotSV_latest.xlsx file. See Table1 for loss SV (DEL) and Table2 for gain SV (DUP).

Regarding your question:

In my opinion:

Of course, this AnnotSV score is an adaptation of the joint consensus recommendation of ACMG and ClinGen (Riggs et al., 2020), not an exact implementation.

Can you share your thoughts on all this? Does it seem correct to you after this explanation?

Best,

Véronique

lgmgeo commented 7 months ago

Any news?

milena-andreuzo commented 7 months ago

Hi Véronique, thanks for the explanation, and I'm sorry about the late response -- my mailbox was full and I didn't get the notification for your reply. After discussing internally with my colleagues, we reached the same conclusion you have now given me. Especially this:

if 2F is validated, then 4O is automatically validated (both with B_loss_coord)

if 2F is not validated, then 4O could be validated anyway (with po_B_loss_allG_coord)

That's why I choose not to apply 4O if 2F is applied.
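
The rule quoted above can be written as a small decision function. This is only a sketch in Python, not AnnotSV's actual code: the function name and the boolean inputs are hypothetical, and I am assuming that a non-empty B_loss_coord means 2F is validated and a non-empty po_B_loss_allG_coord means 4O can be validated on its own.

```python
from typing import Optional


def benign_loss_label(b_loss_hit: bool, po_b_loss_allg_hit: bool) -> Optional[str]:
    """Return the benign-overlap criterion label for a loss SV, or None.

    b_loss_hit:         assumed to mean B_loss_coord is non-empty (2F validated).
    po_b_loss_allg_hit: assumed to mean po_B_loss_allG_coord is non-empty.
    """
    if b_loss_hit:
        # 2F implies 4O, so both are reported together and scored once.
        return "2F/4O"
    if po_b_loss_allg_hit:
        # 2F is not validated, but 4O can still be validated alone.
        return "4O"
    return None


print(benign_loss_label(True, True))
print(benign_loss_label(False, True))
print(benign_loss_label(False, False))
```

Returning a single label per SV makes it explicit that the benign overlap is never counted twice.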

I'm glad you confirmed it and to us it makes sense.

Thank you,

Milena