geneontology / go-annotation

This repository hosts the tracker for issues pertaining to GO annotations.
BSD 3-Clause "New" or "Revised" License

PAINT annotations and "contributes to" rpb4 as an e.g #1319

Closed ValWood closed 8 years ago

ValWood commented 9 years ago

In the paint file rpb4 orthologs have PAINT annotations to

  - DNA-directed RNA polymerase activity
  - single-stranded DNA binding
  - single-stranded RNA binding

At SGD these have IDA annotations with "contributes to" because the complex binds DNA/RNA. I don't think any annotations with "contributes to" should be transferred by PAINT unless it is confirmed that this particular subunit is required for the activity.

Here, it isn't. rpb4 is a peripheral submodule, and I don't think it is required for DNA or RNA binding; it is involved in interactions between RNA polymerase and Mediator.

See http://www.sciencedirect.com/science/article/pii/S0968000404002725

Possibly we should review the practice of using contributes_to?

krchristie commented 9 years ago

Hi Val,

Having done both the majority of the SGD annotations for RNA polymerase subunits and also having done the PAINT annotations for this family (which I am looking at right now), I have a couple comments.

  1. Looking at the family in PAINT, the "RNA-directed RNA polymerase activity" term is not propagated to any sequences in the RPB4 family. I did not make these annotations in SGD. I recall looking at the paper briefly and did not feel that this term should be propagated, but also didn't feel it worth challenging.
  2. In case you meant the "contributes to" on the term "DNA-directed RNA polymerase activity", when I look both at what is present in PAINT, and the PomBase page for Rpb4 (http://www.pombase.org/spombe/result/SPBC337.14), the contributes to on "DNA-directed RNA polymerase activity" is present on the experimentally made IDA annotation of the pombe gene.
  3. My recollection from making the "RNA binding" and "DNA binding" annotations for Rpb4 in cerevisiae is that these are from assays done for the 4/7 subcomplex independently of the RNAP II holoenzyme. I definitely did not even make "RNA binding" or "DNA binding" annotations for subunits of the RNA polymerase complex unless there was evidence that that specific subunit was involved, which meant that I didn't make very many annotations for those terms for any of the 12 subunits of RNAP II in cerevisiae.

The 4/7 subcomplex has a unique and complex role in RNAP II. Certainly the 10 subunit enzyme lacking it is capable of polymerizing RNA in vitro on a template that circumvents the normally required initiation factors, so 4/7 is not required for the catalytic activity. However, the 4/7 complex has also been implicated as having some role in regulating initiation. The 4/7 subcomplex also appears to be involved in mRNA export from nucleus, deadenylation-dependent mRNA decay, and recruitment of 3'-end processing factors to the transcribing RNAP II.

So, in summary, while I agree that "contributes to" should be transferred cautiously, in this case it was transferred with a great deal of background knowledge and thought about what is actually known about the 4/7 subcomplex.

-Karen

ValWood commented 9 years ago

Hi Karen,

I think the main problem is that if an annotation transfer is made from a gene with a "contributes_to" qualifier, there needs to be a mechanism to retain the "contributes_to" qualifier on the annotations that are created. Would you agree? (The contributes_to qualifier is not present in the PAINT file.)

ValWood commented 9 years ago

So, my understanding is that "contributes_to" with a complex subunit means that the complex was annotated, not the individual gene product, or that multiple subunits contribute to the activity.

Therefore, it seems sensible to either

  1. not make the annotation transfer for a contributes_to annotation, OR,
  2. also ensure that the contributes_to qualifier appears in the annotation file.

I think either solution would work?

I think the annotations make sense, but they need the "contributes_to" qualifier. If the gene product is known to bind RNA or DNA directly, presumably it wouldn't have the contributes_to qualifier?
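The two options above can be sketched as a small transfer policy. This is only an illustration of the idea, not how PAINT is implemented: the `Annotation` class, the `transfer` function, and the policy names are hypothetical, and the example values are illustrative.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Annotation:
    """Hypothetical, simplified annotation record (not the PAINT data model)."""
    gene: str
    go_id: str
    evidence: str
    qualifiers: FrozenSet[str] = frozenset()


def transfer(source: Annotation, target_gene: str,
             policy: str = "keep_qualifier") -> Optional[Annotation]:
    """Transfer an annotation to another gene under one of two policies.

    policy="skip":           option 1, do not transfer contributes_to annotations.
    policy="keep_qualifier": option 2, transfer and carry the qualifiers along.
    """
    if policy not in ("skip", "keep_qualifier"):
        raise ValueError(f"unknown policy: {policy}")
    if policy == "skip" and "contributes_to" in source.qualifiers:
        return None
    # Inferred annotations get the IBA evidence code; qualifiers are copied as-is.
    return Annotation(gene=target_gene, go_id=source.go_id,
                      evidence="IBA", qualifiers=source.qualifiers)


# Illustrative source annotation: an IDA annotation carrying contributes_to.
source = Annotation("RPB4", "GO:0003697", "IDA", frozenset({"contributes_to"}))
skipped = transfer(source, "rpb4", policy="skip")
kept = transfer(source, "rpb4", policy="keep_qualifier")
```

Under either policy, no transferred annotation ends up asserting the activity for the subunit alone when the source only asserted it with "contributes_to".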

krchristie commented 9 years ago

Hi Val,

There is a mechanism in the PAINT tool to choose whether or not to propagate a qualifier when it is present on a source annotation.

According to what I see in PAINT for this family, I did propagate the "contributes to" qualifier. I also find it present in the family specific GAF generated by PAINT for this family.

Also, when I look in MGI, I see the "contributes_to" qualifier is present on all three of the appropriate terms for the Polr2d gene in mouse: http://www.informatics.jax.org/go/marker/MGI:1916491

Could you provide more detail about exactly where you're seeing the problem?

-Karen

ValWood commented 9 years ago

I see this in the PomBase PAINT GAF. I'll have to dig around to tell you exactly where I got this from, but it was the place Chris pointed me to, and it was only a couple of weeks ago. I'll see if I can track down the link.

PomBase SPBC337.14 rpb4 GO:0005665 PAINT_REF:21297 IEA PANTHER:PTN000480538 C DNA-directed RNA polymerase II complex subunit Rpb4 protein taxon:4896 20140415 GO_Central
PomBase SPBC337.14 rpb4 GO:0006367 PAINT_REF:21297 IEA PANTHER:PTN000480538 P DNA-directed RNA polymerase II complex subunit Rpb4 protein taxon:4896 20140415 GO_Central
PomBase SPBC337.14 rpb4 GO:0003727 PAINT_REF:21297 IEA PANTHER:PTN000480538 F DNA-directed RNA polymerase II complex subunit Rpb4 protein taxon:4896 20140415 GO_Central
PomBase SPBC337.14 rpb4 GO:0031369 PAINT_REF:21297 IEA PANTHER:PTN000480538 F DNA-directed RNA polymerase II complex subunit Rpb4 protein taxon:4896 20140415 GO_Central
PomBase SPBC337.14 rpb4 GO:0003697 PAINT_REF:21297 IEA PANTHER:PTN000480538 F DNA-directed RNA polymerase II complex subunit Rpb4 protein taxon:4896 20140415 GO_Central
PomBase SPBC337.14 rpb4 GO:0045948 PAINT_REF:21297 IEA PANTHER:PTN000480538 P DNA-directed RNA polymerase II complex subunit Rpb4 protein taxon:4896 20140415 GO_Central
PomBase SPBC337.14 rpb4 GO:0000288 PAINT_REF:21297 IEA PANTHER:PTN000480538 P DNA-directed RNA polymerase II complex subunit Rpb4 protein taxon:4896 20140415 GO_Central
PomBase SPBC337.14 rpb4 GO:0031990 PAINT_REF:21297 IEA PANTHER:PTN000480538 P DNA-directed RNA polymerase II complex subunit Rpb4 protein taxon:4896 20140415 GO_Central
PomBase SPBC337.14 rpb4 GO:0034402 PAINT_REF:21297 IEA PANTHER:PTN000480538 P DNA-directed RNA polymerase II complex subunit Rpb4 protein taxon:4896 20140415 GO_Central
PomBase SPBC337.14 rpb4 GO:0000932 PAINT_REF:21297 IEA PANTHER:PTN000480538 C DNA-directed RNA polymerase II complex subunit Rpb4 protein taxon:4896 20140415 GO_Central
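A check like the following could flag such rows automatically. It is a minimal sketch that assumes the file is tab-separated GAF 2.x, where the qualifier is column 4 and the GO ID is column 5 (the excerpt above has lost its tabs and empty columns in pasting). The function names and the set of GO IDs to check are hypothetical.

```python
def parse_gaf_line(line):
    """Split one tab-separated GAF 2.x row into the fields used here."""
    cols = line.rstrip("\r\n").split("\t")
    if len(cols) < 15:
        raise ValueError(f"expected at least 15 columns, got {len(cols)}")
    return {
        "object_id": cols[1],
        "symbol": cols[2],
        # Column 4 may be empty or hold pipe-separated qualifiers.
        "qualifiers": {q for q in cols[3].split("|") if q},
        "go_id": cols[4],
        "evidence": cols[6],
        "aspect": cols[8],
    }


def missing_contributes_to(lines, go_ids):
    """Return rows for the given GO IDs that lack the contributes_to qualifier."""
    flagged = []
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue  # skip header/comment lines and blank lines
        row = parse_gaf_line(line)
        if row["go_id"] in go_ids and "contributes_to" not in row["qualifiers"]:
            flagged.append(row)
    return flagged


# One row rebuilt from the excerpt above, with the empty columns restored.
example = "\t".join([
    "PomBase", "SPBC337.14", "rpb4", "", "GO:0003697", "PAINT_REF:21297",
    "IEA", "PANTHER:PTN000480538", "F",
    "DNA-directed RNA polymerase II complex subunit Rpb4", "", "protein",
    "taxon:4896", "20140415", "GO_Central",
])
flagged = missing_contributes_to(["!gaf-version: 2.0", example], {"GO:0003697"})
```

Which GO IDs are expected to carry the qualifier has to come from the source annotations; the set passed in here is only an example.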

ValWood commented 9 years ago

Ignore the fact that they are IEA. I switched the evidence code temporarily so that I could filter redundant annotations before evaluation.

krchristie commented 9 years ago

Val,

Good luck with tracking this down.

However, I don't think I have anything else to add to this thread, since I've checked that the "contributes_to" qualifier is being propagated appropriately in the original PAINT GAF and has been propagated all the way through to the mouse IBA annotations incorporated in MGI.

-Karen

ValWood commented 9 years ago

I found it. @cmungall pointed me here: exported for consumption into MODs: http://www.geneontology.org/gene-associations/submission/paint/pre-submission/

So this file does not appear to have "contributes_to".

Chris, which tracker do you want me to move this to? (no hurry...this is parked for me for a few weeks)

selewis commented 8 years ago

Touchup and PAINT v2 use GoLR (rather than MySQL) to load the experimental annotations. It turns out the GoLR load wasn't including the qualifiers. Heiko has fixed it. Changes should percolate into pre-submission as soon as Touchup is run again.
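Once Touchup has been rerun, a comparison along these lines could confirm that qualifiers survive the export. This is a rough sketch, not part of Touchup or PAINT: it assumes both inputs are tab-separated GAF 2.x (object ID in column 2, qualifier in column 4, GO ID in column 5), and the function names and the choice of (object ID, GO ID) as the key are assumptions.

```python
def qualifier_index(lines):
    """Map (object ID, GO ID) to the set of qualifiers seen for that pair."""
    index = {}
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue  # skip header/comment lines and blank lines
        cols = line.rstrip("\r\n").split("\t")
        key = (cols[1], cols[4])
        index.setdefault(key, set()).update(q for q in cols[3].split("|") if q)
    return index


def dropped_qualifiers(source_lines, export_lines):
    """Return qualifiers present in the source GAF but absent from the export."""
    source = qualifier_index(source_lines)
    export = qualifier_index(export_lines)
    dropped = {}
    for key, quals in source.items():
        if key in export and quals - export[key]:
            dropped[key] = quals - export[key]
    return dropped


def make_row(qualifier):
    """Build an illustrative 15-column GAF row with the given qualifier."""
    return "\t".join([
        "PomBase", "SPBC337.14", "rpb4", qualifier, "GO:0003697",
        "PAINT_REF:21297", "IBA", "PANTHER:PTN000480538", "F",
        "DNA-directed RNA polymerase II complex subunit Rpb4", "", "protein",
        "taxon:4896", "20140415", "GO_Central",
    ])


# Source row carries the qualifier; the exported row has lost it.
result = dropped_qualifiers([make_row("contributes_to")], [make_row("")])
```

An empty result would mean that every qualifier in the source file also appears on the matching exported annotation.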