
22.11 Metrics, documentation, questions #2815

Closed by HelenaCornu 1 year ago

HelenaCornu commented 1 year ago

Gene burden: Olesya's new model

Variant functional consequence: based on @ireneisdoomed's metrics in #2640, I made a variant functional consequence visualisation

Documentation

HelenaCornu commented 1 year ago

Question — comparing the metrics (22.11.1 pipeline run), I see we've lost evidence from CRISPR/Behan et al. (1838 in 22.11 vs 1846 in 22.09). How come?

We've also gained evidence in ClinGen, EVA (genetic and somatic), Genetics (I thought we integrated the new data in 22.09), and Reactome, and lost evidence from G2P, Expression Atlas, IMPC, and UniProt (Literature, Variants). I'm just curious where this is coming from :)

@ireneisdoomed I saw your comment on Slack re: 22.09 metrics. Is this related to that issue?

ireneisdoomed commented 1 year ago

Re: Gene burden

It is not exactly like that. This paper brings evidence of association for 9 genes; 7 of them (B3GNT3, AUNIP, ADH5, TUBA1B, OR1G1, CAPN10, and TREML1) are associations that were not supported until now.

For most of the genes, the signal could be replicated in multiple meta-analyses that varied mainly in the predicted impact of the qualifying variants. However, 3 of them didn't show exome-wide significance in the meta-analyses, so for those we only report the test and the cohort where statistical significance was seen.

If you look at this comment, the genes in blue were replicated in the meta-analysis and the ones in orange were not: https://github.com/opentargets/issues/issues/2805#issuecomment-1315473164

ireneisdoomed commented 1 year ago

Re: Functional consequences

I like the visualisation a lot! The numbers have changed a little since I opened the ticket; they were also based on raw evidence, I believe.

Here are the updated values:

# Gene2Phenotype
+------------------------------+-----+
|variantFunctionalConsequenceId|count|
+------------------------------+-----+
|                    SO_0002317| 2002|
|                    SO_0002318|  740|
|                    SO_0002220|  188|
|                          null|   58|
|                    SO_0001566|    8|
|                    SO_0002315|    6|
|                    SO_0001622|    5|
+------------------------------+-----+

# Orphanet
+------------------------------+-----+
|variantFunctionalConsequenceId|count|
+------------------------------+-----+
|                          null| 4850|
|                    SO_0002054| 1112|
|                    SO_0002053|  194|
+------------------------------+-----+
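
For reference, here's a minimal sketch of how counts like these can be reproduced with PySpark. It assumes the evidence set is available as the JSON evidence export, partitioned by sourceId; the paths are illustrative, not the actual pipeline code.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Count evidence strings per functional consequence for each source.
# The partitioned layout "evidence/sourceId=<source>" is an assumption.
for source in ["gene2phenotype", "orphanet"]:
    evidence = spark.read.json(f"evidence/sourceId={source}")
    (
        evidence
        .groupBy("variantFunctionalConsequenceId")  # null is kept as its own group
        .count()
        .orderBy(F.desc("count"))
        .show(truncate=False)
    )
```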
ireneisdoomed commented 1 year ago

Re: Reasons to stop classifier

I don't know how much detail we want to go into here, but I think it would be nice to mention that the classifier algorithm has been improved so that the overall accuracy has increased by 20%.

The main idea of the improvement is that the model is now trained on the basis that multiple classes can be assigned to the same text. Each possible class is a feature for the classifier, so it can learn how the classes correlate.

For example, if the reason to stop was "Lack of enrolment due to the COVID-19 situation", it could be assigned to both "Insufficient_Enrollment" and "COVID-19". Before, we would feed the model a separate example for each class, hoping that the next time it saw a similar scenario the output would be both classes. What we have recently implemented is the fact that classes are not mutually exclusive: we feed the model once, saying that both classes are true for that reason. This way the algorithm will hopefully identify patterns where classes frequently co-occur and where one class can be more important than the other. So it now learns to predict a class (or multiple) in the context of the others, which is great.
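
To make that concrete, here is a toy illustration of the two labelling schemes. The example text and the two class names come from the paragraph above; the rest of the class list and the encoding are hypothetical, not the actual training code.

```python
# Toy illustration of single-label vs. multi-label target encoding.
# Only "COVID-19" and "Insufficient_Enrollment" come from the comment
# above; the other class names are hypothetical.
classes = ["Business_Administrative", "COVID-19", "Insufficient_Enrollment", "Safety_Sideeffects"]

text = "Lack of enrolment due to the COVID-19 situation"

# Before: one training example per applicable class, each with a single label,
# so the model never sees that both classes apply to the same text.
old_examples = [(text, "COVID-19"), (text, "Insufficient_Enrollment")]

# Now: one training example with a multi-hot target, so the model learns
# the classes jointly and can pick up on how they co-occur.
labels = {"COVID-19", "Insufficient_Enrollment"}
new_example = (text, [1 if c in labels else 0 for c in classes])
print(new_example)  # ('Lack of enrolment ...', [0, 1, 1, 0])
```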

Technically speaking, to adapt to this we had to change the function that outputs the probabilities for each class, switching from a softmax function (typically used when classes are mutually exclusive and you want the top-scoring one) to a sigmoid function, which outputs an independent probability for each class. The probability threshold that we've seen gives good results without introducing too much noise is 0.3, so if the predicted probability is above 0.3, we display that class.
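
As a minimal sketch of that inference step (only the per-class sigmoid and the 0.3 threshold come from the text; the class names and logit values are made up):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

classes = ["COVID-19", "Insufficient_Enrollment", "Safety_Sideeffects"]
logits = np.array([1.2, 0.4, -2.0])  # hypothetical model outputs for one text

probs = sigmoid(logits)              # independent probability per class, ~[0.77, 0.60, 0.12]
predicted = [c for c, p in zip(classes, probs) if p > 0.3]
print(predicted)  # both COVID-19 and Insufficient_Enrollment pass the threshold
```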

This comment has an example where we see such an improvement in accuracy: https://github.com/opentargets/issues/issues/2794#issuecomment-1309925767

ireneisdoomed commented 1 year ago

Re: metrics

The rest of the numbers you see dancing around are, I believe, a byproduct of changes in the ontology.