geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

GO term specificity proposal 5, gene products as "activities" (viral processes) #12420

Closed ValWood closed 7 years ago

ValWood commented 8 years ago

Processes which refer to gene products as if they are "activities"

suppression by virus of host IRF9 activity by positive regulation of IRF9 localization to nucleus
suppression by virus of host PKR activity by positive regulation of PKR nuclear localization

"IRF9 activity" is not an activity
"PKR activity" is not an activity

@pgaudet @thomaspd @dosumis

rebeccafoulger commented 8 years ago

A note on the history of these: when we were working on the viral node, we did discuss the specificity of these terms and decided to include them as an exception because the viral proteins had very specific host targets, and they were mainly needed for mapping to UniProt KWs, where it's not possible to annotate using extensions.

ValWood commented 8 years ago

Can't UniProt create real GO annotations here instead of mappings? Perhaps to one of the parents like "GO:0039653 suppression by virus of host transcription", or a term like "immune suppression by regulation of host transcription" if they want to capture immune suppression.

1. This one is particularly bad for numerous reasons...not least the definition doesn't fit the term name

suppression by virus of host IRF9 activity by positive regulation of IRF9 localization to nucleus

Any process in which a virus stops, prevents, or reduces the activity of host IRF9 (interferon regulatory factor-9) by promoting the nuclear accumulation of IRF9. For example, the reovirus mu2 protein promotes nuclear accumulation of host IRF9 by an as yet unconfirmed-mechanism.

The term says that the activity of IRF9 is affected, but the definition refers to "nuclear accumulation" rather than the effect on transcription (IRF9 is a transcription factor).

  1. Is this annotation likely to be true for "IRF9" in all species? I.e. is the virus in question likely to infect all of these species?
  2. Many, many viral proteins interact with host proteins in some way, so it seems bad practice to set a precedent of introducing terms which include the name of the affected host gene product.

dosumis commented 8 years ago

These should probably become more general terms, with the gene product named in the extension via has_regulation_target.

IRF9 is a transcription factor, so this should be something like

"suppression by virus of transcription factor activity* by regulation of transcription factor localization to nucleus" has_regulation_target(IRF9)

Completely formalising these terms is hard though.
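
The pattern suggested above, a general term plus the gene product named in an extension, can be sketched as a small data structure. This is only an illustrative Python sketch: the class names, the `render` format, and the use of "reovirus mu2" as the annotated gene product are assumptions made for the example, not an actual GO annotation format or a real annotation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Extension:
    """One annotation extension, written as relation(target)."""
    relation: str  # e.g. "has_regulation_target"
    target: str    # the host gene product, named here instead of in the term


@dataclass
class Annotation:
    """A gene product annotated to a general GO term, with optional extensions."""
    gene_product: str
    go_term: str
    extensions: List[Extension] = field(default_factory=list)

    def render(self) -> str:
        # Join extensions as relation(target), comma-separated.
        ext = ",".join(f"{e.relation}({e.target})" for e in self.extensions)
        return f"{self.gene_product} | {self.go_term} | {ext}"


# Hypothetical example: the general term label follows the wording proposed above.
annotation = Annotation(
    gene_product="reovirus mu2",
    go_term=(
        "suppression by virus of transcription factor activity by regulation "
        "of transcription factor localization to nucleus"
    ),
    extensions=[Extension("has_regulation_target", "IRF9")],
)
print(annotation.render())
```

The point of the design is that the term label stays free of gene product names, while the target (here IRF9) is still recoverable from the extension.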

ValWood commented 8 years ago

I can't even find an experimentally annotated version of human IRF9.....

rebeccafoulger commented 8 years ago

Several of the more specific terms (GO:0039582 or GO:0039560) don't have annotations so I'm not convinced they are useful anyhow, now AEs are more developed.

The higher level ones do have lots of annotations through UniProt KW:GO mappings. E.g. 'GO:0039580 suppression by virus of host PKR activity' has >37,000 IEA annotations- this target information couldn't be captured manually in AE because of the sheer volume.

You might want to get the SIB UniProt curators' viewpoint on these. I don't know how much viral annotation is being done manually at the moment.

ValWood commented 8 years ago

Wow, 37,000 IEA annotations from 3 experimental annotations, via:

i) The herpes simplex virus 1 Us11 protein inhibits autophagy through its interaction with the protein kinase PKR.
ii) Inhibition of the interferon-inducible protein kinase PKR by HCV E2 protein.
iii) Hepatitis C virus nonstructural protein 5A modulates the toll-like receptor-MyD88-dependent signaling pathway in macrophage cell lines.

Shouldn't mappings like this be confined to closely related viruses? (Maybe they are, but 37,000 annotations sounds like a lot here because this is likely to be very context dependent...)

rebeccafoulger commented 8 years ago

They are closely related viruses. It's the different strains/subtypes that bulk up the numbers, I think. It's done through KW mappings (not ISS/Compara etc.), so the KW has been assigned to the TrEMBL entry, and thus the GO term can be annotated.

I'll email the UniProt virus annotators so they can monitor this post.

PhilLeMercier commented 8 years ago

Dear all,

About UniProt2GO: I am in the Swiss-Prot virus program, and we created virus GO terms with Jane and Rebecca and the help of other experts (2 publications: [http://www.ncbi.nlm.nih.gov/pubmed/?term=25233094+or+26215368]). It took five years to get these GO terms created. In the meantime we used UniProt Keywords to annotate. Once the GO terms were created, they were mapped to pre-existing UniProt Keywords, and this produced a lot of IEA annotations through UniProt2GO.

Now we are lucky to have a decent list of GO virus terms, and we made more than 300 IDA annotations last year.

About IRF9: IRF9 is a key element of interferon signaling in vertebrates [http://viralzone.expasy.org/all_by_protein/683.html]. This antiviral pathway is 100% conserved in vertebrates and targeted by many viruses. We participated in creating GO:0039560 "suppression by virus of host IRF9 activity", and this term is essential from a virologist's point of view. Maybe "IRF9 activity" could be replaced by "IRF9 signaling".

On the other hand, I don't know if we need the child term "GO:0039561 suppression by virus of host IRF9 activity by positive regulation of IRF9 localization to nucleus"; I agree it's weird.

Wow, 37,000 IEA annotations from 3 experimental annotations

Actually this seems big, but one has to know that there are 600k HIV-1 isolate sequences, 450k Influenza A, etc. Many UniProt Keywords are propagated in TrEMBL to virus isolates within the same species. This makes sense, and produces a bunch of mapped entries. In turn, UniProt2GO creates a lot of IEA mappings. There is nothing wrong with these annotations. On the other hand, there are many incorrect GO annotations in virus entries, but that is another debate...
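
To illustrate why a handful of keyword assignments can turn into tens of thousands of IEA annotations, here is a minimal Python sketch of a keyword-to-GO mapping applied to propagated entries. The keyword label, the accession names, and the function name are hypothetical; only GO:0039580 is taken from this thread, and this is not a description of how UniProt2GO is actually implemented.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping table: one keyword label mapped to one GO term.
KEYWORD_TO_GO: Dict[str, str] = {
    "host PKR inhibition (hypothetical keyword label)": "GO:0039580",
}


def map_keywords_to_go(
    entries: Dict[str, List[str]], keyword_to_go: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Return (accession, GO id, evidence code) for every mapped keyword."""
    annotations = []
    for accession, keywords in entries.items():
        for keyword in keywords:
            go_id = keyword_to_go.get(keyword)
            if go_id is not None:
                # Mapped annotations are electronic, hence the IEA code.
                annotations.append((accession, go_id, "IEA"))
    return annotations


# Hypothetical entries: many isolates of one virus species share a keyword.
kw = "host PKR inhibition (hypothetical keyword label)"
entries = {f"ISOLATE_{i:04d}": [kw] for i in range(1000)}
entries["UNRELATED_0001"] = ["some other keyword"]

iea = map_keywords_to_go(entries, KEYWORD_TO_GO)
print(len(iea))
```

In this sketch the number of IEA annotations scales with the number of isolate entries carrying the keyword, not with the number of experimental annotations behind the mapping.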

I hope this helps

cheers

Philippe Le Mercier

PhilLeMercier commented 8 years ago

I can't even find an experimentally annotated version of human IRF9.....

Maybe you can check this out: [www.uniprot.org/uniprot/Q00978]

ValWood commented 8 years ago

On 25/04/2016 14:31, PhilLeMercier wrote:

About IRF9: IRF9 is a key element of interferon signaling in vertebrates. http://viralzone.expasy.org/all_by_protein/683.html . This antiviral pathway is 100% conserved in vertebrates and targeted by many viruses. We participated in creating GO:0039560 "suppression by virus of host IRF9 activity", and this term is essential from a virologist's point of view. Maybe "IRF9 activity" could be replaced by "IRF9 signaling".

If it is the signalling pathway which is being regulated, this sounds more appropriate, although, based on http://www.uniprot.org/uniprot/Q00978, should it be suppression by virus of host "type I interferon signaling pathway (GO:0060337)"?

PhilLeMercier commented 8 years ago

We already had a lot of discussion before creating these terms.

Pretty much all vertebrate viruses have at least one protein that hits the interferon pathway: about 39,000 entries have the "Inhibition of host interferon signaling pathway by virus" keyword in UniProt. That is why we think child terms are justified, like "suppression by virus of host IRF9 activity".

In my view, "suppression by virus of host IRF9 activity" is part of "suppression by virus of host type I interferon-mediated signaling pathway".

cheers

Philippe Le Mercier, PhD
Head of Swiss-Prot virus program
SIB | Swiss Institute of Bioinformatics
Geneva, Switzerland
viralzone.expasy.org
Tel: +41 22 379 58 70

ValWood commented 8 years ago

OK it seems this example is controversial. Instead I'll try to find an equivalent example from eukaryotes, and we can close this ticket.

Note that the problem here isn't that the viruses are breaking biological rules. Molecular functions and processes target gene products, but gene products are not generally included in GO terms, although they are present in many legacy terms. This bunch of related tickets is a first attempt to identify specific examples of the use of gene product names (and other term specificity issues).

This causes a problem for the information you want to capture here, so I probably picked a bad example for this type, but I wonder if this would be better phrased in terms of the regulation of the specific pathway, rather than of a specific gene product? (I.e. is the gene product name acting as a proxy for a specific signalling pathway?) For example, the term "GO:0039560 suppression by virus of host IRF9 activity" has a single experimental annotation from PMID:17275127, but this publication is about the effects on STAT1 phosphorylation, which is upstream (they examined the effect of NS5A on IFN-alpha signaling through STAT1 phosphorylation). Maybe there is a better way to phrase these terms? (For example, to make the viral suppression terms mirror the signalling processes the human genes are annotated with?)

Possible action: Remove the unused terms?
Possible action: Rephrase terms to represent signalling pathways regulated?

ValWood commented 8 years ago

So, I just spotted another term: "suppression by virus of host STAT1 activity". The paper I refer to above (PMID:17275127) could just as easily have been annotated to this term. Do you see why this will lead to annotation inconsistencies?

The only reference to IRF9 in the above paper is "The phosphorylated STAT1–STAT2 heterodimer translocates to nucleus and complexes with IFN-regulatory factor 9 (IRF9) or P48. The resulting IFN-stimulated gene factor 3 (ISGF3) binds to the IFN-stimulated response elements (ISREs) sequences in the promoters of IFNα/β-stimulated genes (ISGs) [8]."

STAT1 http://www.uniprot.org/uniprot/P42224

PhilLeMercier commented 8 years ago

Dear Val,

The difficulty with the virus ontology was to meet the needs of virologists without breaking the rules of the GO ontology structure. In my view, Jane and Rebecca did a great job at that, and it was not without heated debates between GO and virus experts.

So, I just spotted another term: "suppression by virus of host STAT1 activity". The paper I refer to above (PMID:17275127) could just as easily have been annotated to this term. Do you see why this will lead to annotation inconsistencies?

It's true that the PMID:17275127 annotation seems wrong here, and indeed we have not seen any evidence of HCV ever targeting IRF9. We see a lot of errors in GO assignments for viruses. I suggest you dispute this particular annotation.

Best regards

ValWood commented 8 years ago

In my view, Jane and Rebecca did a great job at that

I am not disputing that, but GO is an evolving structure. Any changes would still need to fulfil these needs, but there might be a more consistent AND GO-compliant way to do this.

plemercier commented 7 years ago

Dear all, I propose to obsolete those two terms because:

  1. they actually tend to go beyond our intent for the virus functional ontology
  2. we didn't use them for annotation

best regards

Philippe

pgaudet commented 7 years ago

After exchange with @plemercier and @pmasson55, we propose to obsolete the following terms:

| GO ID | GO term | Number of annotations |
|---|---|---|
| GO:0039558 | suppression by virus of host IRF7 activity by positive regulation of IRF7 sumoylation | 1 annotation CACAO |
| GO:0039577 | suppression by virus of host JAK1 activity by negative regulation of JAK1 phosphorylation | 0 annotations |
| GO:0039583 | suppression by virus of host PKR activity by positive regulation of PKR catabolic process | 0 annotations |
| GO:0039551 | suppression by virus of host IRF3 activity by positive regulation of IRF3 catabolic process | 1 annotation AgBase |
| GO:0039559 | suppression by virus of host IRF7 activity by positive regulation of IRF7 catabolic process | 0 annotations |
| GO:0039582 | suppression by virus of host PKR activity by positive regulation of PKR nuclear localization | 0 annotations |
| GO:0039569 | suppression by virus of host STAT2 activity by positive regulation of STAT2 catabolic process | 0 annotations |
| GO:0039561 | suppression by virus of host IRF9 activity by positive regulation of IRF9 localization to nucleus | 0 annotations |
| GO:0039575 | suppression by virus of host TYK2 activity by negative regulation of TYK2 tyrosine phosphorylation | 0 annotations |
| GO:0039565 | suppression by virus of host STAT1 activity by positive regulation of STAT1 catabolic process | 2 annotations CACAO + AgBase |
| GO:0039571 | suppression by virus of host STAT1 activity by negative regulation of STAT1 tyrosine phosphorylation | 1 annotation AgBase |
| GO:0039567 | suppression by virus of host STAT1 activity by negative regulation of STAT protein import into nucleus | 0 annotations |
| GO:0039566 | suppression by virus of host STAT1 activity by tyrosine dephosphorylation of STAT1 | 0 annotations |
| GO:0039568 | suppression by virus of host STAT1 activity by inhibition of DNA binding | 0 annotations |
| GO:0039570 | suppression by virus of host STAT2 activity by negative regulation of STAT protein import into nucleus | 0 annotations |
| GO:0039572 | suppression by virus of host STAT2 activity by negative regulation of STAT2 tyrosine phosphorylation | 0 annotations |
| GO:0039569 | suppression by virus of host STAT2 activity by positive regulation of STAT2 catabolic process | 0 annotations |
| GO:0039607 | proteolysis by virus of host translation initiation factor | 0 annotations |
| GO:0039608 | suppression by virus of host translation initiation factor activity by induction of host protein dephosphorylation | 1 annotation AgBase |
| GO:0039555 | suppression by virus of host MDA-5 activity via MDA-5 binding | 1 annotation UniProt |
| GO:0039541 | suppression by virus of host RIG-I via RIG-I binding | 1 annotation UniProt |
| GO:0039542 | suppression by virus of host RIG-I K63-linked ubiquitination | 0 annotations |
| GO:0039543 | suppression by virus of host RIG-I activity by viral RNA 5' processing | 0 annotations |
| GO:0039544 | suppression by virus of host RIG-I activity by viral RNA 5' processing | 4 EXP UniProt + some ISS |
| GO:0039546 | suppression by virus of host MAVS activity by MAVS proteolysis | 1 annotation UniProt |
| GO:0039659 | suppression by virus of host TBK1-IKBKE-DDX3 complex activity | 0 annotations |
| GO:0039658 | TBK1-IKKE-DDX3 complex | 0 annotations |

pgaudet commented 7 years ago

Obsoletion notice sent:

Dear all,

The proposal has been made to obsolete

GO:0039558 suppression by virus of host IRF7 activity by positive regulation of IRF7 sumoylation - 1 annotation CACAO
GO:0039577 suppression by virus of host JAK1 activity by negative regulation of JAK1 phosphorylation - 0 annotations
GO:0039583 suppression by virus of host PKR activity by positive regulation of PKR catabolic process - 0 annotations
GO:0039551 suppression by virus of host IRF3 activity by positive regulation of IRF3 catabolic process - 1 annotation AgBase
GO:0039559 suppression by virus of host IRF7 activity by positive regulation of IRF7 catabolic process - 0 annotations
GO:0039582 suppression by virus of host PKR activity by positive regulation of PKR nuclear localization - 0 annotations
GO:0039569 suppression by virus of host STAT2 activity by positive regulation of STAT2 catabolic process - 0 annotations
GO:0039561 suppression by virus of host IRF9 activity by positive regulation of IRF9 localization to nucleus - 0 annotations
GO:0039575 suppression by virus of host TYK2 activity by negative regulation of TYK2 tyrosine phosphorylation - 0 annotations
GO:0039565 suppression by virus of host STAT1 activity by positive regulation of STAT1 catabolic process - 2 annotations CACAO + AgBase
GO:0039571 suppression by virus of host STAT1 activity by negative regulation of STAT1 tyrosine phosphorylation - 1 annotation AgBase
GO:0039567 suppression by virus of host STAT1 activity by negative regulation of STAT protein import into nucleus - 0 annotations
GO:0039566 suppression by virus of host STAT1 activity by tyrosine dephosphorylation of STAT1 - 0 annotations
GO:0039568 suppression by virus of host STAT1 activity by inhibition of DNA binding - 0 annotations
GO:0039570 suppression by virus of host STAT2 activity by negative regulation of STAT protein import into nucleus - 0 annotations
GO:0039572 suppression by virus of host STAT2 activity by negative regulation of STAT2 tyrosine phosphorylation - 0 annotations
GO:0039569 suppression by virus of host STAT2 activity by positive regulation of STAT2 catabolic process - 0 annotations
GO:0039607 proteolysis by virus of host translation initiation factor - 0 annotations
GO:0039608 suppression by virus of host translation initiation factor activity by induction of host protein dephosphorylation - 1 annotation AgBase
GO:0039555 suppression by virus of host MDA-5 activity via MDA-5 binding - 1 annotation UniProt
GO:0039541 suppression by virus of host RIG-I via RIG-I binding - 1 annotation UniProt
GO:0039542 suppression by virus of host RIG-I K63-linked ubiquitination - 0 annotations
GO:0039543 suppression by virus of host RIG-I activity by viral RNA 5' processing - 0 annotations
GO:0039544 suppression by virus of host RIG-I activity by viral RNA 5' processing - 4 EXP UniProt + some ISS
GO:0039546 suppression by virus of host MAVS activity by MAVS proteolysis - 1 annotation UniProt
GO:0039659 suppression by virus of host TBK1-IKBKE-DDX3 complex activity - 0 annotations
GO:0039658 TBK1-IKKE-DDX3 complex - 0 annotations
GO:0039578 suppression by virus of host JAK1 activity via JAK1 binding - 0 annotations
GO:0039581 suppression by virus of host PKR activity via double-stranded RNA binding - 0 annotations
GO:0039550 suppression by virus of host IRF3 activity by inhibition of DNA binding - 0 annotations
GO:0039549 suppression by virus of host IRF3 activity by inhibition of IRF3 phosphorylation - 2 annotations CACAO + UniProt
GO:0046779 suppression by virus of expression of host genes with introns

Annotations are indicated above. The reason for obsoleting is that they are too specific. The exact molecular roles of the individual proteins should be captured separately.

Comments can be made here: https://github.com/geneontology/go-ontology/issues/12420

We are opening a comment period for this proposed obsoletion. We'd like to proceed and obsolete these terms on July 17, 2017.

Unless objections are received by July 17, 2017, we will assume that you agree to this change.

Thanks,

Pascale

Also include: GO:0039605 TFIIB-class transcription factor binding involved in viral suppression of host transcription initiation from RNA polymerase II promoter

cmungall commented 7 years ago

Is the plan to make GOCAMs for each of the annotations here?

pgaudet commented 7 years ago

We can!

pgaudet commented 7 years ago

Missed GO:0039605 TFIIB-class transcription factor binding involved in viral suppression of host transcription initiation from RNA polymerase II promoter - 0 annotations