monarch-initiative / helpdesk

The Monarch Initiative Helpdesk
BSD 3-Clause "New" or "Revised" License

Feedback form - Ray Stefancsik - Hi team, I am looking at this page https://monarchinitiat... #129

Open monarch-issues-tracker[bot] opened 1 week ago

monarch-issues-tracker[bot] commented 1 week ago

Name Ray Stefancsik

Email stefancsik@gmail.com

GitHub Username @rays22

Details Page: /feedback Browser: Firefox 126.0 Device: Apple Macintosh OS: Mac OS 10.15 Engine: Gecko 126.0

Hi team, I am looking at this page https://monarchinitiative.org/HP:0006501?associations=biolink:GeneToPhenotypicFeatureAssociation and I am wondering where these Gene to Phenotype associations come from? What is the evidence for these gene-phenotype associations? Thanks, Ray

iimpulse commented 1 week ago

Hi @rays22,

These annotations are inferred from the gene-to-disease associations. Where a disease is associated with a phenotype and that disease is associated with a gene, the third connection is made.

Hope this helps. If you have feedback or questions, I am all ears!

rays22 commented 1 week ago

Thanks @iimpulse for the prompt response. The information you provided helps me to pinpoint some problems.

One minor problem is that there is no provenance for the "Gene to Phenotype" associations on these pages. It is a minor problem compared with the next one, because the lack of provenance is an error of omission.

However, the major problem is that many of the 'has phenotype' associations between specific genes and specific phenotypic features are false and misleading. I was hoping that there exists somewhere some reference to scientific studies that validate these associations. However, I have checked several of these gene-phenotype associations and they are not true. The reason for this is that a "disease" is a loose collection of phenotypic features, in other words, a spectrum of phenotypic features. The genes causally implicated in the disease are correct associations. However, very often none of the alleles of an individual gene engenders a specific phenotypic feature of the disease spectrum. Therefore, the assumption that every single gene that is associated with the disease is also associated with every single phenotypic feature is wrong and very misleading. For each of these associations, there should be information on whether it has been either

  1. validated by experimental studies, or
  2. inferred without experimental evidence.

To some degree, every gene is associated with every disease, but it is important to provide valid information on implied direct gene-phenotype associations.

cmungall commented 1 week ago

assumption that every single gene that is associated with the disease is also associated with every single phenotypic feature is wrong and very misleading

This is indeed an incorrect assumption - we should avoid implying this! Is there a place in the documentation that implies this @rays22 - if so we should fix it.

Note this assumption doesn't even apply without the join - variable penetrance etc

I agree completely that we should include more provenance. The g2p table on the HPO site is just the join of g2d and d2p. I don't see why we can't propagate more metadata from both sides across the join and include it here: https://hpo.jax.org/data/annotation-format. We've also discussed including evidence metadata from ClinGen and GenCC in g2d; this could also be propagated across.
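
To make the join concrete, here is a minimal sketch of joining g2d and d2p on the disease while carrying provenance from both sides into each g2p row. The record types, field names, and all identifiers (`GENE:X`, `DISEASE:D`, `SOURCE_A`, `REF:1`, and so on) are hypothetical placeholders for illustration; they are not the actual HPOA or Monarch schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class GeneToDisease:
    gene: str
    disease: str
    source: str  # hypothetical: resource asserting the gene-disease link


@dataclass(frozen=True)
class DiseaseToPhenotype:
    disease: str
    phenotype: str
    source: str               # hypothetical: resource asserting the annotation
    reference: Optional[str]  # hypothetical: citation, or None if not provided


@dataclass(frozen=True)
class InferredGeneToPhenotype:
    gene: str
    phenotype: str
    via_disease: str
    g2d_source: str
    d2p_source: str
    d2p_reference: Optional[str]
    evidence: str = "inferred"  # every joined row is flagged as inferred


def join_g2d_d2p(
    g2d: List[GeneToDisease], d2p: List[DiseaseToPhenotype]
) -> List[InferredGeneToPhenotype]:
    """Join on disease, keeping the provenance of both input rows."""
    by_disease = {}
    for row in d2p:
        by_disease.setdefault(row.disease, []).append(row)

    joined = []
    for g in g2d:
        for p in by_disease.get(g.disease, []):
            joined.append(
                InferredGeneToPhenotype(
                    gene=g.gene,
                    phenotype=p.phenotype,
                    via_disease=g.disease,
                    g2d_source=g.source,
                    d2p_source=p.source,
                    d2p_reference=p.reference,
                )
            )
    return joined


# Placeholder data only; none of these are real annotations.
g2d = [
    GeneToDisease("GENE:X", "DISEASE:D", "SOURCE_A"),
    GeneToDisease("GENE:Y", "DISEASE:D", "SOURCE_A"),
]
d2p = [
    DiseaseToPhenotype("DISEASE:D", "PHENO:P", "SOURCE_B", "REF:1"),
    DiseaseToPhenotype("DISEASE:D", "PHENO:R", "SOURCE_C", None),
]
rows = join_g2d_d2p(g2d, d2p)
for r in rows:
    print(r.gene, r.phenotype, r.via_disease, r.d2p_source, r.d2p_reference, r.evidence)
```

With the provenance columns kept on each row, a page could show which disease a gene-phenotype row was derived through and which resource supplied the underlying disease annotation.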

rays22 commented 1 week ago

This is indeed an incorrect assumption - we should avoid implying this! Is there a place in the documentation that implies this @rays22 - if so we should fix it.

I have not been able to find the documentation on this and I may have misunderstood the response to my query:

These annotations are inferred from the gene-to-disease associations. Where a disease is associated with a phenotype and that disease is associated with a gene, the third connection is made.

iimpulse commented 1 week ago

While I completely agree, you can't directly relate a gene to a phenotype without taking into account something like variable penetrance. I would imagine that GWAS information from cohorts for each disease and its traits is almost non-existent. Again, there is no definitive source for phenotype to gene. @pnrobinson

Our first change here should be to add the predicate or metadata 'inferred' to these annotations to make it apparent that these have no direct linkage to evidence.

Additionally, moving forward with evidence metadata from clingen and gencc would provide us with information about g2d, but not about p2g.

Any suggestions on future implementation are greatly appreciated.

pnrobinson commented 1 week ago

@rays22 -- the HPOAs are all for rare diseases, where the connection between gene and disease is much clearer than with common, complex/polygenic diseases.

rays22 commented 6 days ago

@pnrobinson ,

I thought "Rare disease" is a statistical category in epidemiology. Rare genetic disease is related to the incidence of spontaneous mutations in human populations, but I am quite certain that the basic rules of scientific method and genetics still apply. Below I am hoping to clarify that the issue I have raised is unrelated to GWAS. "Penetrance" is also not the issue here, because my points apply to any non-zero penetrance phenotypic feature. By the way, how do you differentiate 0% penetrance of a phenotypic feature from any non-causal gene-allele relationship? They look the same to me.

Evaluating evidence from previous experiments is very important when making a decision to support or discard a hypothesis, for example, the hypothesis that "a(ny) mutation in gene X causes phenotype Y". In the context of the current ticket, if there is any experimental evidence that at least one mutant allele in gene X is reasonably confidently implicated in a causal relationship with phenotypic feature P, then it should be clearly differentiated from other relationships where there is no evidence for such a relationship.

If gene X and gene Y alleles are associated with disease D, and disease D has a spectrum of phenotypic features P, Q, R, it is still possible that, for example, gene X alleles are never causally associated with all the phenotypic features (P, Q, R) of disease D. There are documented cases where there is extensive overlap between the gene-allele and phenotypic feature associations, but only a subset of the phenotypic features of the disease can be associated with each associated gene (e.g. P and Q, but not R). As an example, could you kindly show evidence that TNNT3 has phenotype Aplasia/Hypoplasia of the radius? I have checked the relevant papers and found no evidence for this, and none of the authors of those papers ever suggested it existed. If this is a hypothetical association, then it should be clearly flagged as such in the table.
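
The X, Y, D, P, Q, R scenario above can be sketched in a few lines. This is an illustration under stated assumptions: the `evidenced` set is invented for the example, standing in for gene-phenotype pairs that some study directly supports. It does not describe any real gene or disease.

```python
def classify_joined_pairs(g2d, d2p, evidenced):
    """Label every gene-phenotype pair produced by joining g2d and d2p.

    g2d:       set of (gene, disease) pairs
    d2p:       set of (disease, phenotype) pairs
    evidenced: set of (gene, phenotype) pairs with direct evidence
               (hypothetical input for this illustration)
    """
    joined = {
        (gene, phenotype)
        for (gene, disease) in g2d
        for (other_disease, phenotype) in d2p
        if disease == other_disease
    }
    return {
        pair: "evidenced" if pair in evidenced else "inferred_only"
        for pair in joined
    }


# Genes X and Y are both associated with disease D, which has features P, Q, R.
g2d = {("X", "D"), ("Y", "D")}
d2p = {("D", "P"), ("D", "Q"), ("D", "R")}

# Assumption for the example: X is only ever linked to P and Q, never to R.
evidenced = {("X", "P"), ("X", "Q"), ("Y", "P"), ("Y", "Q"), ("Y", "R")}

labels = classify_joined_pairs(g2d, d2p, evidenced)
for pair in sorted(labels):
    print(pair, labels[pair])
```

The join yields all six gene-phenotype pairs, including X with R, even though in this example nothing supports that pair directly; the label is what would let a table distinguish the two kinds of row.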

Please do not underestimate the risk of losing trust in Monarch (and HPO) if you do not clearly distinguish human gene - clinical phenotype associations based on evidence from associations that are only hypothetical.

pnrobinson commented 5 days ago

@rays22 I do not understand what the issue is. There are no intentionally hypothetical annotations in HPOA. One problem is that the import of the HPOA data into the Monarch site is not 100% and the numbers do not match. It would be much better to use the HPO API, and if there is some mistake in the HPOAs to report it.

In any case, I do not see an annotation for TNNT3 for Aplasia of radius here: https://hpo.jax.org/browse/disease/OMIM:618435

rays22 commented 5 days ago

In any case, I do not see an annotation for TNNT3 for Aplasia of radius here: https://hpo.jax.org/browse/disease/OMIM:618435

That is excellent news, @pnrobinson. Thank you. Here is a screenshot to illustrate the issue on the Monarch page: [screenshot: TNNT3_radius]

rays22 commented 5 days ago

I do not understand what the issue is. There are no intentionally hypothetical annotations in HPOA.

FYI: Although this is a Monarch browser ticket, here is something from another website that may prove useful for fixing the source of the Monarch issue. Select 'Gene Associations', then scroll down in the table here. You will find that TNNT3 is listed as a gene associated with Aplasia/Hypoplasia of the radius HP:0006501 on this page too. [screenshot: TNNT3-radius-jax]

pnrobinson commented 5 days ago

This annotation comes from ORPHANET, which annotated on the basis of expert opinion and does not provide PMID citations. This particular annotation may be an error, because it is reported as frequent by Orphanet but is not reported by OMIM or, AFAICS, in PubMed. However, there is a source (https://www.orpha.net/en/disease/sign/1147). This is a common problem with rare diseases: PubMed is not necessarily comprehensive. On the other hand, expert opinion is harder to track. For this reason, the HPO team's annotations are based only on PMIDs. Both options are valid, I would say. However, the Monarch app should keep track of the provenance. This is obvious on the HPO webpage but does not appear on the Monarch webpage. We should discuss via Zoom.