Closed gocentral closed 9 years ago
Logged In: YES user_id=473890 Originator: YES
The telomerase activity item, combined with all the snoRNA curation I've done lately, has got me thinking about how to fit base pairing into the function ontology.
For starters, I've just drafted three new terms: "base pairing activity", "DNA base pairing activity", and "RNA base pairing activity" (possible defs at end). There are three terms I've identified to be moved under either the "DNA base pairing activity" or the "RNA base pairing activity" terms:
With an expansion of the first level of children of the "RNA modification guide activity" term, that produces this:
- nucleic acid binding ; GO:3676
-- base pairing activity ; GO:new
--- DNA base pairing activity ; GO:new
---- template for synthesis of G-rich strand of telomere DNA activity ; GO:332
--- RNA base pairing activity ; GO:new
---- RNA modification guide activity ; GO:30555
----- RNA 2'-O-ribose methylation guide activity ; GO:30561
----- RNA pseudouridylation guide activity ; GO:30558
----- rRNA modification guide activity ; GO:30556
----- snRNA modification guide activity ; GO:30566
----- tRNA modification guide activity ; GO:30557
---- triplet codon-amino acid adaptor activity ; GO:30533
The above all seems reasonable. Most of these child terms are already defined in terms of base pairing anyway.
Then, to focus in a slightly different way, you can look at some leaf terms that are specific to rRNA modification guide activity. If only the 3 new terms above are added, then these leaf terms have parentage up to 'RNA base pairing activity', and also up to 'rRNA binding', like this:
- RNA binding ; GO:3723
-- RNA base pairing activity ; GO:new
--- RNA modification guide activity ; GO:30555
---- rRNA modification guide activity ; GO:30556
----- rRNA 2'-O-ribose methylation guide activity ; GO:30562
----- rRNA pseudouridylation guide activity ; GO:30559
-- rRNA binding ; GO:19843
--- rRNA modification guide activity ; GO:30556
---- rRNA 2'-O-ribose methylation guide activity ; GO:30562
---- rRNA pseudouridylation guide activity ; GO:30559
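As a rough illustration of the double parentage shown in the tree above, here is a minimal Python sketch that enumerates the is_a paths from a leaf term up to 'RNA binding'. The edge table is transcribed by hand from the tree in this comment; the `IS_A` dictionary and the `paths_to` function are hypothetical names used only for this example and are not part of any GO tool.

```python
# is_a edges (child -> list of parents), transcribed from the tree in this
# comment. Term names only; this is an illustration, not the real ontology file.
IS_A = {
    "RNA base pairing activity": ["RNA binding"],
    "rRNA binding": ["RNA binding"],
    "RNA modification guide activity": ["RNA base pairing activity"],
    "rRNA modification guide activity": [
        "RNA modification guide activity",
        "rRNA binding",
    ],
    "rRNA 2'-O-ribose methylation guide activity": [
        "rRNA modification guide activity"
    ],
    "rRNA pseudouridylation guide activity": [
        "rRNA modification guide activity"
    ],
}


def paths_to(term, target, is_a=IS_A):
    """Return every is_a path from term up to target as a list of term names."""
    if term == target:
        return [[term]]
    found = []
    for parent in is_a.get(term, []):
        for tail in paths_to(parent, target, is_a):
            found.append([term] + tail)
    return found


if __name__ == "__main__":
    for path in paths_to("rRNA pseudouridylation guide activity", "RNA binding"):
        print(" -> ".join(path))
```

With these edges, each leaf term reaches 'RNA binding' by one path through 'RNA base pairing activity' and one path through 'rRNA binding'.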
A question I have is whether we would want to go farther and add terms like "rRNA base pairing activity", and so on for various types of RNA. It would add a lot of paths and might not get us much since we already have corresponding binding terms for everything.
But I think the 3 highest level base pairing terms could be added right now, and then the 3 existing terms I mention given the additional base-pairing parentage as above. For the telomerase template term, it could have its direct is_a parentage to the molecular_function root term removed because it would now have is_a parentage under "DNA base pairing activity". Here are some possible defs for the 3 new terms.
base pairing activity - Interacting selectively with any nucleic acid via hydrogen bonds between the bases.
DNA base pairing activity - Interacting selectively with deoxyribonucleic acid (DNA) via hydrogen bonds between the bases.
RNA base pairing activity - Interacting selectively with ribonucleic acid (RNA) via hydrogen bonds between the bases.
thoughts?
-Karen
Original comment by: krchristie
Logged In: YES user_id=436423 Originator: NO
This all looks reasonable to me -- and it's good to see progress on it after such a long (albeit wholly understandable) lag. Thanks!
The one thing I might suggest is to add comments to some of the terms, especially DNA base pairing activity and its child(ren), to help avoid confusion -- make sure it's really super-clear that we mean base pairing with DNA, not base pairing by DNA (... unless I'm the only one who had to read it twice to get it).
m
Original comment by: mah11
Logged In: YES user_id=473890 Originator: YES
Hi Midori,
I have no objection to adding comments, but I don't think they are a complete solution to the problem of possible confusion, since so many viewers don't see them. Thus it might be worth modifying the def to be as clear as possible, if it can be done in a way that is within good GO practice.
The def I currently have suggested for 'DNA base pairing activity' is:
Interacting selectively with deoxyribonucleic acid (DNA) via hydrogen bonds between the bases.
Maybe this would be clearer and still OK within GO practice:
Interacting selectively with deoxyribonucleic acid (DNA) via hydrogen bonds between the bases of a gene product molecule and the bases of a target DNA molecule.
And for consistency, would that wording still work for the sibling "RNA base pairing activity" term?
Also, in your comment, you referred to "DNA base pairing activity and its child(ren)". This brings me to my biggest question on adding the base pairing terms: do we want to go farther and add terms like "rRNA base pairing activity", and so on for various types of RNA? While it would add a lot of paths, it does seem that it would be more consistent with the fact that we already have similar terms such as 'rRNA binding'.
thoughts?
-Karen
Original comment by: krchristie
Logged In: YES user_id=473890 Originator: YES
Hi again,
I worked through some ideas to see how it would come out. The specific issues are outlined below and I've put my working file, called: gene_ontology_basePairing_20080211.obo.gz
in this directory:
ftp://genome-ftp.stanford.edu/pub/people/curator/
-Karen
One idea is to name the terms "base pairing with X" instead of "X base pairing activity", except for the top level term. "Base pairing with nucleic acid" seems a little stupid, so I left it as just base pairing. I know we have most of the function terms ending with "activity", but not the binding terms. Since we don't require the binding terms to end with the word activity, perhaps it is consistent with current practice to allow the base pairing terms to also not end with the word activity. To me this wording makes it clearer that the nucleic acid specified is the target, but I can accept it if we need to keep the 'activity' word as part of these term names. Let me know what you think.
Defs - I've done the defs with this basic structure:
Interacting selectively with deoxyribonucleic acid (DNA) via hydrogen bonds between the bases of a gene product molecule and the bases of a target DNA molecule.
to try to be as clear as possible. Let me know if this works for you.
Comment: Note that with respect to annotation, "DNA base pairing activity" and its child terms are intended to be used to annotate the activity of gene products composed of nucleic acid, presumably RNA, to interact with DNA molecules via base pairing. Internal base pairing with itself is considered part of the secondary structure of the molecule and is not within the scope of GO function.
Do these serve the intended goal? If not, please suggest modifications.
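To make the proposed def and comment easier to review side by side, here is a minimal Python sketch that formats them as an OBO-style [Term] stanza. The `obo_term` helper is hypothetical and written only for this example; the "GO:new" ids are the same placeholders used earlier in this thread, the empty `[]` after the def stands in for a dbxref list that has not been chosen, and the actual stanza in the working file may differ.

```python
def obo_term(term_id, name, definition, comment=None, is_a=()):
    """Format one OBO-style [Term] stanza as a single string."""
    lines = [
        "[Term]",
        f"id: {term_id}",
        f"name: {name}",
        "namespace: molecular_function",
        # Empty dbxref list: the source of the def is not decided here.
        f'def: "{definition}" []',
    ]
    if comment:
        lines.append(f"comment: {comment}")
    for parent_id, parent_name in is_a:
        lines.append(f"is_a: {parent_id} ! {parent_name}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Def and comment text are taken from this comment; ids are placeholders.
    print(
        obo_term(
            "GO:new",
            "DNA base pairing activity",
            "Interacting selectively with deoxyribonucleic acid (DNA) via "
            "hydrogen bonds between the bases of a gene product molecule "
            "and the bases of a target DNA molecule.",
            comment=(
                "Internal base pairing with itself is considered part of the "
                "secondary structure of the molecule and is not within the "
                "scope of GO function."
            ),
            is_a=[("GO:new", "base pairing activity")],
        )
    )
```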
Original comment by: krchristie
Logged In: YES user_id=436423 Originator: NO
1-3: All looks fine; good point about comments, but it doesn't hurt to include them for those who do read them!
4: If the rRNA, snRNA, etc. terms will be useful for annotations, might as well include them.
m
Original comment by: mah11
Logged In: YES user_id=473890 Originator: YES
Thanks for looking everything over Midori. I'll go ahead and put these terms in then.
-Karen
Original comment by: krchristie
Logged In: YES user_id=473890 Originator: YES
Going back to Midori's earlier comment on point #4, I don't know if the child terms of "base pairing with RNA", e.g. "base pairing with rRNA", will really be useful as direct annotation terms, but they make a lot of logical sense considering that we already have all the comparable "RNA binding" terms. In addition, it doesn't increase the number of paths any, because anything that should be a child of a specific "base pairing with xRNA" no longer needs to be a direct child of "xRNA binding", since "base pairing with xRNA" will be the direct child of "xRNA binding".
So, I've added these new terms:
GO:0000496 base pairing
GO:0000497 base pairing with DNA
GO:0000498 base pairing with RNA
GO:0000499 base pairing with mRNA
GO:0000944 base pairing with rRNA
GO:0000945 base pairing with snRNA
GO:0000946 base pairing with tRNA
In addition, here are some notable parentage changes or additions:
---- template for synthesis of G-rich strand of telomere DNA activity ; now under "base pairing with DNA" instead of directly under "molecular function"
---- triplet codon-amino acid adaptor activity ; GO:30533 now has additional parentage under "base pairing with mRNA"
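The reasoning behind dropping the telomerase template term's direct link to the root can be sketched in Python: a direct is_a parent is redundant when it is already reachable through another parent. The edge table below is an assumption built for this example. It takes "base pairing" to sit under "nucleic acid binding" as in the earlier tree, and it adds "binding" between "nucleic acid binding" and the root, which is not spelled out in this thread. The names `IS_A`, `ancestors`, and `redundant_parents` are hypothetical.

```python
TEMPLATE = "template for synthesis of G-rich strand of telomere DNA activity"

# is_a edges (child -> parents). Assumed for illustration: the "binding" link
# to the root is not stated in this thread, and TEMPLATE is shown with both its
# old direct root link and its new parent so the redundancy is visible.
IS_A = {
    "binding": ["molecular_function"],
    "nucleic acid binding": ["binding"],
    "base pairing": ["nucleic acid binding"],
    "base pairing with DNA": ["base pairing"],
    TEMPLATE: ["molecular_function", "base pairing with DNA"],
}


def ancestors(term, is_a):
    """Return the set of all terms reachable from term by following is_a."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(is_a.get(current, []))
    return seen


def redundant_parents(term, is_a):
    """Return direct parents that are also ancestors of another direct parent."""
    parents = is_a.get(term, [])
    return [
        p
        for p in parents
        if any(p in ancestors(q, is_a) for q in parents if q != p)
    ]


if __name__ == "__main__":
    print(redundant_parents(TEMPLATE, IS_A))
```

Under these assumed edges, the direct link from the template term to the root is the only one reported as redundant, which matches the change described above.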
Using OE2!, I think I've done OK with all the paths, but please do check, Midori.
-Karen
Original comment by: krchristie
The subject of having molecular function terms for base pairing was discussed quite some time ago and arose again in this SF item:
[ 1585409 ] MF is_a orphan: template for synthesis of G-rich strand...
https://sourceforge.net/tracker/index.php?func=detail&aid=1585409&group_id=36855&atid=440764
For reference, the discussion from the other item is below.
-Karen
Original Question:
template for synthesis of G-rich strand of telomere DNA activity, GO:0000332
only [p] telomerase activity
I don't know enough about this to assign an is_a parent but there's bound to be someone reading this that will.
This is the only is_a orphan in the function ontology.
Tanya
Date: 2006-10-27 17:19 Sender: kchris Logged In: YES user_id=473890
Hi,
I'm the person that added the term and off the top of my head I'm not sure where it would fit. Looking at the list of the direct children of molecular function (where a plus means it has children and a minus means it does not):
molecular function
+antioxidant activity
+catalytic activity
+chaperone regulator activity
-chemoattractant activity
-chemorepellant activity
+energy transducer activity
+enzyme regulator activity
+motor activity
-nutrient reservoir activity
-protein tag
+signal transducer activity
+structural molecule activity
+transcription regulator activity
+translation regulator activity
+transporter activity
-triplet codon-amino acid adaptor activity
The activity of this term seems closest to that of "triplet codon-amino acid adaptor activity", in that it involves base pairing. However, there aren't any terms for base pairing in the function ontology. I suppose base-pairing would be a type of binding...
Would we want to think about creating some base-pairing terms under binding activity?
-Karen
Date: 2006-10-31 03:07 Sender: gomidori Logged In: YES user_id=436423
I seem to recall a very brief discussion about base pairing terms a long time ago ... I think the catch was how to define them so as to avoid willy-nilly annotation of any old DNA or RNA sequence. But I don't think it's an insurmountable problem; I suspect the real reason no one followed through earlier is simply that other things claimed our attention.
Base pairing terms would solve the orphan problem neatly :)
Date: 2006-11-01 11:49 Sender: kchris Logged In: YES user_id=473890
as opposed to willy-nilly annotation of any old protein to 'protein binding' ;)
Hard to see how it could be worse than the annotations to the 'protein binding' term; there aren't nearly as many RNA genes.
-Karen
Date: 2007-01-19 03:06 Sender: gomidori Logged In: YES user_id=436423 Originator: NO
looking at this again at last ... I'm not opposed to having base pairing terms! As long as everyone remembers that it's the gene product that annotations apply to (and I have no doubts about GOC annotators) there won't be any problem.
Karen, are you interested in working on some terms?
m
Reported by: krchristie
Original Ticket: "geneontology/ontology-requests/4156":https://sourceforge.net/p/geneontology/ontology-requests/4156