geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

Merge terms under 'GO:0000991 transcription factor activity, core RNA polymerase II binding' and children #15798

Closed pgaudet closed 6 years ago

pgaudet commented 6 years ago

This ticket is replaced by https://github.com/geneontology/go-ontology/issues/16053

Manual (non-ISS) annotations are here: https://docs.google.com/spreadsheets/d/1Apv6MiftFCXKHKHVY_ebkZG8PH49PL1HlPsAFW-8LK4/edit#gid=0

There is no action needed if we merge, this is just to check that the merge is OK.

pgaudet commented 6 years ago

@krchristie @ValWood @RLovering @srengel @vanaukenk @sylvainpoux @ggeorghiou @hattrill @bmeldal @hdrabkin

Can you please have a look before I implement?

Thanks, Pascale

bmeldal commented 6 years ago

GO:0000990 transcription factor activity, core RNA polymerase binding -> merge into GO:0000993 RNA polymerase II core binding

Isn't that merging a more general term into a specific child? Are we sure all GO:0000990 annotations were meant to be for RNAPII?
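A minimal sketch of the check behind this question: before merging, test whether the source term is an ancestor of the merge target, which would mean a general term is being folded into one of its own more specific descendants. The `IS_A` map, the `EX:` identifiers, and both function names are hypothetical placeholders for illustration; they are not real GO relationships and this is not how the GO editors ran the check.

```python
from typing import Dict, Set

# Hypothetical toy is_a edges (child -> parents). Placeholder IDs only,
# NOT the real GO graph.
IS_A: Dict[str, Set[str]] = {
    "EX:specific_term": {"EX:general_term"},
    "EX:general_term": {"EX:root"},
    "EX:sibling_term": {"EX:root"},
}


def ancestors(term: str, is_a: Dict[str, Set[str]]) -> Set[str]:
    """Return every term reachable from `term` by following is_a edges upward."""
    seen: Set[str] = set()
    stack = list(is_a.get(term, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, ()))
    return seen


def merges_into_descendant(source: str, target: str, is_a: Dict[str, Set[str]]) -> bool:
    """True if `source` is an ancestor of `target` (general merged into specific)."""
    return source in ancestors(target, is_a)


general_into_specific = merges_into_descendant("EX:general_term", "EX:specific_term", IS_A)
sibling_into_specific = merges_into_descendant("EX:sibling_term", "EX:specific_term", IS_A)
```

If the check returns True, every annotation on the source term would need to be reviewed to confirm it really fits the narrower target, which is the concern raised here.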

hattrill commented 6 years ago

From a quick glance at the FlyBase ones, I can see that they were probably made in a slightly different spirit: more aimed at capturing that they regulate transcription but are not DNA-binding TFs, I think. I will need a bit of time to review and re-house them.

ValWood commented 6 years ago

1 I would obsolete GO:0000989 transcription factor activity, transcription factor binding and GO:0008134 transcription factor binding (28 manual annotations, excluding ISS)

and suggest using protein binding terms instead if appropriate, but also checking that they have the correct "general transcription initiation factor activity" or "cofactor activity" (since this is what most seem to be?)

If we move them now we will probably need to move them again later. Better to just get them to their final home?

2 GO:0000993 RNA polymerase II core binding

Again, do we need this term? Why not just "protein binding" if demonstrated with the correct "general transcription initiation factor activity" or "cofactor activity"?

3 Birgit is correct: GO:0000990 transcription factor activity, core RNA polymerase binding is a polymerase-independent term. It should not merge into GO:0000993 RNA polymerase II core binding

but again, I think it is confusing to have both the "general/cofactor" term AND an RNA polymerase or TF binding term to describe the same activities. This is why we all got confused in the first place.....

Personally I would obsolete or merge the binding branch into the cognate term in the non-binding branch....

ValWood commented 6 years ago

so here: https://github.com/geneontology/go-annotation/issues/1734 I propose to merge these 4 terms into 'GO:0000995' (to be renamed "general RNA polymerase III transcription factor activity", see geneontology/go-ontology#14790)

so I just moved all of these this week to GO:0000995 as the new “general RNA polymerase III transcription factor activity” pending the name change.

but here you are saying GO:0000995 transcription factor activity, core RNA polymerase III binding -> merge into GO:0000994 RNA polymerase III core binding (5 manual annotations, excluding ISS) pombase

I think we need the “GO:0000995 general RNA polymerase III transcription factor activity”

ValWood commented 6 years ago

So, I thought we would be getting rid of "transcription factor binding" terms and try to just use the simplified terms relating to your schema ? If this is the case we don't want to migrate step-wise up the "transcription factor binding" branch.

But if you can unambiguously map to the other branch, that would be useful?

ValWood commented 6 years ago

Arghh I'm so confused ;!!! ;)

pgaudet commented 6 years ago

@bmeldal

GO:0000990 transcription factor activity, core RNA polymerase binding -> merge into GO:0000993 RNA polymerase II core binding

Isn't that merging a more general term into a specific child? Are we sure all GO:0000990 annotations were meant to be for RNAPII?

yes, I checked.

pgaudet commented 6 years ago

@ValWood I corrected https://github.com/geneontology/go-annotation/issues/1734. In the end I just renamed GO:0000995 from "transcription factor activity, core RNA polymerase III binding" to "RNA polymerase III general initiation factor activity" (and didn't merge the other terms in).

pgaudet commented 6 years ago

@ValWood

I think we need the “GO:0000995 general RNA polymerase III transcription factor activity”

This is done, it should trickle through shortly.

pgaudet commented 6 years ago

You are right about the final home. For sure that protein binding branch is not really a solution.

First: we don't have a term 'transcription factor activity', so it's really odd to have terms that add precision to it. What we have is:

So if we want to describe what types of protein a protein binds to, it should be to one of the 3. So anyways these will need to be reviewed (this is why I wanted to 'park' them there for now).

I don't know if all the proteins in that list (see Google doc above) can have a home now. I think we're missing the elongation factor (I can rescue that from the obsoletes). Can you spot anything else missing?

Thanks, Pascale

ValWood commented 6 years ago

I'm still confused

GO:0000995 transcription factor activity, core RNA polymerase III binding -> merge into GO:0000994 RNA polymerase III core binding (5 manual annotations, excluding ISS) pombase

but yesterday: https://github.com/pombase/curation/issues/2011

Which term do I need (ID) for "RNA polymerase III general transcription initiation factor activity" ?

pgaudet commented 6 years ago

Hi @ValWood My bad!!! GO:0000995 will become RNA polymerase III general transcription initiation factor activity

so the merge I propose above is irrelevant.

Thanks for picking that up !

Pascale

vanaukenk commented 6 years ago

@pgaudet I'm looking through the WB annotations. We have a range of gene products annotated to the terms cited above, e.g. general TFII subunits, obligate heterodimer specific TFs, and bona fide transcriptional co-activators. I'll need to sort through these to see exactly how we should re-annotate.

Wrt the general pol II transcription initiation factors, though: GO:0016251 general RNA polymerase II transcription factor activity has been restored, right? To help guide curators, what kind(s) of experiment(s) are sufficient evidence to select this MF term for annotation?

pgaudet commented 6 years ago

@ValWood For GO:0008134 transcription factor binding (28 manual annotations, excluding ISS) I see 797 direct EXP annotations - are you sure about the 28? Also, there are many children to this term; we also need to look at those before obsoleting the parent.

ValWood commented 6 years ago

Did I say 28?

ValWood commented 6 years ago

I see 251 in QuickGO...

pgaudet commented 6 years ago

For GO:0008134 transcription factor binding ?

I see 943 EXP in QuickGO - and still 797 in AmiGO
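The counts quoted here (28, 251, 797, 943) were obtained with different tools and, possibly, different filters; the thread does not establish the actual cause of the gap. As a rough sketch of how a direct-annotation count could be reproduced from a GAF file, the snippet below tallies evidence codes for one GO ID. It relies on the GAF layout in which the GO ID is column 5 and the evidence code is column 7; the function name and the sample rows are made up for illustration (rows are truncated to 7 columns and are not real annotations), and the `EXPERIMENTAL` set leaves out the high-throughput codes.

```python
from collections import Counter
from typing import Iterable

# Experimental evidence codes (high-throughput codes omitted in this sketch).
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def evidence_counts(gaf_lines: Iterable[str], go_id: str) -> Counter:
    """Count direct annotations to `go_id` per evidence code (no descendants)."""
    counts: Counter = Counter()
    for line in gaf_lines:
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and GAF header/comment lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 7:
            continue  # malformed row
        if cols[4] == go_id:  # column 5: GO ID
            counts[cols[6]] += 1  # column 7: evidence code
    return counts


# Made-up sample rows, truncated to the first 7 GAF columns.
SAMPLE = [
    "!gaf-version: 2.2",
    "\t".join(["ExDB", "X1", "geneA", "enables", "GO:0008134", "REF:made_up", "IPI"]),
    "\t".join(["ExDB", "X2", "geneB", "enables", "GO:0008134", "REF:made_up", "ISS"]),
    "\t".join(["ExDB", "X3", "geneC", "enables", "GO:0008134", "REF:made_up", "IDA"]),
    "\t".join(["ExDB", "X4", "geneD", "enables", "GO:0000993", "REF:made_up", "IPI"]),
]

counts = evidence_counts(SAMPLE, "GO:0008134")
non_iss = sum(n for code, n in counts.items() if code != "ISS")
experimental = sum(n for code, n in counts.items() if code in EXPERIMENTAL)
```

Comparing `non_iss` with `experimental`, and direct counts with counts that include child terms, is one way to rule out filter differences before deciding whether a term has few enough annotations to obsolete.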

RLovering commented 6 years ago

Hi Pascale

I think merging would be great as this will reduce our annotation workload. However, I also think that, as suggested, it would be great to get the transcription regulator terms available, so that people can go through the spreadsheet and try to create additional GO terms based on the transcriptional activities.

Ruth

ValWood commented 6 years ago

Instead of merging GO:0001190 transcriptional activator activity, RNA polymerase II transcription factor binding into GO:0001085 RNA polymerase II transcription factor binding (47 manual annotations, including ISS), should it (and the repressor term) merge into

https://www.ebi.ac.uk/QuickGO/term/GO:0003713

GO:0003713 transcription coactivator activity (Molecular Function)

Definition: A protein or a member of a complex that interacts specifically and non-covalently with a DNA-binding transcription factor to activate the transcription of specific genes. Coregulators often act by altering chromatin structure and modifications. The Mediator complex, which bridges transcription factors and RNA polymerase, is also a transcription coactivator. PMID:10213677 PMID:16858867

It seems to fit the definition?

(Rather than the binding term? Or is this for something else?)

ValWood commented 6 years ago

For example, Rep2 is a co-activator for the MBF transcription factor complex: https://www.ncbi.nlm.nih.gov/pubmed/7588609

with the current proposal there is a loss of repressor/activator specificity

pgaudet commented 6 years ago

@ValWood how are you trying to annotate Rep2?

ValWood commented 6 years ago

Actually it isn't a coactivator, it's part of a DNA binding TF complex

but it also binds to another RNA pol II TF as part of the complex. I'm not so bothered about the binding term (we'll capture that with protein binding)

Go ahead with the merge, but when I finish I don't plan to have anything annotated to "GO:0001085 RNA polymerase II transcription factor binding".....

pgaudet commented 6 years ago

Annotation guidelines presented at today's annotation call are here: https://drive.google.com/drive/folders/11KY9lO9gFHa72B3OzWAfHRhVEH6Tbdon