geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

MP GO:0005658 alpha DNA polymerase:primase complex #11165

Closed gocentral closed 9 years ago

gocentral commented 10 years ago

is_a GO:1990077 primosome complex

should it be?

Reported by: ValWood

Original Ticket: geneontology/ontology-requests/10982

gocentral commented 10 years ago

and GO:1990077 would no longer require the broad synonym "primosome"

Original comment by: ValWood

gocentral commented 10 years ago

Original comment by: jl242

gocentral commented 10 years ago

Looks like it.

We've been working quite a lot on logical defs for the complexes in terms of the functions and processes they carry out so that we get these sorts of inferences automatically.

For primosome complex, could we make the definition:

'protein complex' and capable_of_part_of 'replication fork processing'?

or are there parts of 'replication fork processing' that aren't carried out by primosome complexes?

Original comment by: jl242

gocentral commented 10 years ago

Original comment by: jl242

gocentral commented 10 years ago

I don't think it's involved in replication fork processing? As far as I know it's involved in lagging strand initiation.

Original comment by: ValWood

gocentral commented 10 years ago

This is the def:

Any of a family of protein complexes that form at the origin of replication and function in replication restart in all organisms. Early complexes initiate double-stranded DNA unwinding. The core unit consists of a replicative helicase and a primase. The helicase further unwinds the DNA and recruits the polymerase machinery. The primase synthesizes RNA primers that act as templates for complementary stand replication by the polymerase machinery. The primosome contains a number of associated proteins and protein complexes and is part of the replication initiation process.

Original comment by: jl242

gocentral commented 10 years ago

I think replication fork processing is narrower than, or precedes, 'restart'. It depends where they both start and end. Fork processing involves resolving collapsed structures and breaks, resolving secondary structures, etc. via the Mre11 complex, before replication is reinitiated.

So the primary role of the alpha DNA polymerase:primase complex is lagging strand primer synthesis. The role in restart is additional, because you need initiation at these sites to restart replication.

This def is correct for GO:0005658 alpha DNA polymerase:primase complex for eukaryotic proteins (although the complex presumably has a role in restart too):

Definition: "A complex of four polypeptides, comprising large and small DNA polymerase alpha subunits and two primase subunits, which catalyzes the synthesis of an RNA primer on the lagging strand of replicating DNA; the smaller of the two primase subunits alone can catalyze oligoribonucleotide synthesis." [GOC:mah, PMID:11395402]

Maybe it should not go under GO:1990077 primosome complex, or the primosome complex definition should be tweaked to re-emphasise the role in initiation. Things may be different in prokaryotes though, I don't know... it's confusing, but I wouldn't make a link to fork processing.

Original comment by: ValWood

gocentral commented 10 years ago

Looks like there is a specific replication restart primosome in E. coli which is what may be causing the confusion:

http://cshperspectives.cshlp.org/content/5/5/a012815.full

There are two different primosome complexes in E. coli: one forms at oriC, and the other is probably used only for origin-independent replication, i.e. restart.

So I think I'd just remove any reference to restart in the def and have it only talk about replication initiation, then later we might need another for primosome complexes specifically for restart (I think in euks it's just the same complex?).

So then we're safe to make primosome complex = 'protein complex' and capable_of_part_of 'lagging strand initiation'

However we don't have lagging strand initiation in GO so we may need to add it - we only have 'lagging strand elongation'.

Original comment by: jl242
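To make the "inferences automatically" idea from this exchange concrete, here is a toy sketch of how a genus-plus-differentia logical definition could lead to an inferred is_a link. It is not a real OWL reasoner, and every axiom in it is an assumption for illustration: the `IS_A`, `LOGICAL_DEFS`, and `ASSERTED` tables, and in particular the assertion that the pol alpha:primase complex is capable_of_part_of 'lagging strand initiation', are hypothetical and not taken from the released ontology.

```python
# Toy structural subsumption check; NOT a real OWL reasoner.
# All axioms below are illustrative assumptions, not the released ontology.

# Asserted is_a links (child -> parents).
IS_A = {
    "alpha DNA polymerase:primase complex": {"protein complex"},
    "primosome complex": {"protein complex"},
}

# Logical definitions: genus plus a set of (relation, filler) differentia.
LOGICAL_DEFS = {
    "primosome complex": (
        "protein complex",
        {("capable_of_part_of", "lagging strand initiation")},
    ),
}

# Hypothetical asserted properties of candidate subclasses.
ASSERTED = {
    "alpha DNA polymerase:primase complex": {
        ("capable_of_part_of", "lagging strand initiation"),
    },
}


def ancestors(term):
    """Return the term plus everything reachable through asserted is_a links."""
    seen, stack = {term}, [term]
    while stack:
        for parent in IS_A.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_parents(term):
    """Return defined classes whose genus and differentia the term satisfies."""
    props = ASSERTED.get(term, set())
    result = set()
    for defined, (genus, differentia) in LOGICAL_DEFS.items():
        if defined == term or genus not in ancestors(term):
            continue
        # Each differentia must be matched by an asserted property whose
        # filler is the required process or one of its is_a descendants.
        if all(
            any(rel == r and filler in ancestors(f) for r, f in props)
            for rel, filler in differentia
        ):
            result.add(defined)
    return result


print(inferred_parents("alpha DNA polymerase:primase complex"))
```

Under these assumptions the sketch reports 'primosome complex' as an inferred parent, which is the kind of link the original ticket asked about. A production workflow would use a proper reasoner over the full ontology rather than this hand-rolled check.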

gocentral commented 10 years ago

Original comment by: ValWood

gocentral commented 9 years ago

Original comment by: jl242

gocentral commented 9 years ago

Game plan:

  1. create new term, lagging strand initiation. Definition: The process in which the synthesis of DNA from a template strand in a net 3' to 5' direction is started. This should probably be part_of lagging strand elongation and is_a DNA metabolic process. I'm tempted to make it is_a DNA replication elongation. Any objections to that? From this part of the lagging strand elongation definition, the placement under DNA replication elongation seems correct. "Although each segment of nascent DNA is synthesized in the 5' to 3' direction..."
  2. create logical def for GO:1990077 primosome complex = 'protein complex' and capable_of_part_of 'lagging strand initiation'
  3. Edit first sentence of GO:1990077 definition from "Any of a family of protein complexes that form at the origin of replication and function in replication restart in all organisms." to "Any of a family of protein complexes that form at the origin of replication."

Will leave this open for a bit for comments.

Original comment by: tberardini

gocentral commented 9 years ago

Val asked me to comment, so:

for point (1), the first suggestion for parentage (part_of lagging strand elongation and is_a DNA metabolic process) sounds more accurate than the second (is_a DNA replication elongation), because lagging strand initiation isn't all of an elongation process.

point (3) will be an improvement -- if all priming is done by primosomes, then the primosome def shouldn't be restricted to replication restart

cheers! m

Original comment by: mah11

gocentral commented 9 years ago

I'll plan to implement tomorrow. Last chance for comments!

Original comment by: tberardini

gocentral commented 9 years ago

Hi Tanya,

I think I created the term for the E. coli primosomal complexes. If you search for GO:1990077 in the complex portal

http://www.ebi.ac.uk/intact/complex (11 hits)

you find that I annotated both types of complexes (replication initiation complexes at the origin of replication, and replication restart complexes at stalled forks) with this term. I had a discussion with Rebecca (I think!) about it before we created the terms, as it was a bit tricky to find a solution (for all the replication terms, in fact). In some places we even added eukaryotic and prokaryotic branches to the taxon-agnostic parent terms. We may need that here as well if in eukaryotes the primosomes are the same for both functions. In E. coli the subunits change (although some core proteins are present in all or most primosome complexes found in both processes).

Sorry for complicating the matter late on. I was off moving house!

Birgit

Original comment by: bmeldal

gocentral commented 9 years ago

Hi Birgit,

I can limit my edits to just the first (with only the relationships part_of lagging strand elongation and is_a DNA metabolic process) and last (edit def of complex) ones and leave the logical definition of primosome complex alone.

Would that be a good compromise?

Thanks,

Tanya

Original comment by: tberardini

gocentral commented 9 years ago

Hi Tanya, Val and Midori (and others),

I agree with:

Can we please keep a reference to replication fork re-start in the def? If you don't agree, then we have to make a new term for 'replication fork processing complex' and I have to move the relevant complexes into that section. However, it appears from this discussion that the eukaryotic primosomes carry out both functions anyway? Or did I miss something?

Thanks, Birgit

Original comment by: bmeldal

gocentral commented 9 years ago

Val, Midori,

Help, please. See Birgit's last request wrt the definition edit that was proposed.

Original comment by: tberardini

gocentral commented 9 years ago

I don't have a problem with mentioning that a primosome may act in replication restart. It seemed to me that the existing first sentence of the def could be taken to mean that primosomes act only in restart, which is wrong.

Looking at the full def, it also seems like the last sentence overemphasizes initiation. Priming is needed for initiation, lagging strand elongation, and restart, so the def might as well say so. How about:

"Any of a family of protein complexes that form at the origin of replication and function in replication primer synthesis in all organisms. ... [existing middle] ... The primosome contains a number of associated proteins and protein complexes and contributes to the processes of replication initiation, lagging strand elongation, and replication restart."

... or something to that effect, subject to further edits for accuracy and completeness. I leave the logical definition implications to the current GO editors!

Original comment by: mah11

gocentral commented 9 years ago

Thanks, Midori! Yes, I think my original def was a bit clumsy :(

What about:

"Any of a family of protein complexes that form at the origin of replication OR stalled replication forks and function in replication primer synthesis in all organisms. ... "

Birgit

Original comment by: bmeldal

gocentral commented 9 years ago

Sure, that sounds fine.

Also, it looks like the best way to connect up pol alpha:primase would be GO:1990077 primosome complex has_part GO:0005658 alpha DNA polymerase:primase complex (getting back to the original request!).

m

Original comment by: mah11

gocentral commented 9 years ago

Second attempt:

  1. Create new term, lagging strand initiation. Definition: The process in which the synthesis of DNA from a template strand in a net 3' to 5' direction is started. Relationships: part_of lagging strand elongation and is_a DNA metabolic process.
  2. Create logical def for GO:1990077 primosome complex = 'protein complex' AND capable_of_part_of 'lagging strand initiation' AND capable_of_part_of GO:0031297 replication fork processing
  3. Edit the first and last sentences of the GO:1990077 definition from

"Any of a family of protein complexes that form at the origin of replication and function in replication restart in all organisms. Early complexes initiate double-stranded DNA unwinding. The core unit consists of a replicative helicase and a primase. The helicase further unwinds the DNA and recruits the polymerase machinery. The primase synthesizes RNA primers that act as templates for complementary stand replication by the polymerase machinery. The primosome contains a number of associated proteins and protein complexes and is part of the replication initiation process."

to

"Any of a family of protein complexes that form at the origin of replication or stalled replication forks and function in replication primer synthesis in all organisms. Early complexes initiate double-stranded DNA unwinding. The core unit consists of a replicative helicase and a primase. The helicase further unwinds the DNA and recruits the polymerase machinery. The primase synthesizes RNA primers that act as templates for complementary stand replication by the polymerase machinery. The primosome contains a number of associated proteins and protein complexes and contributes to the processes of replication initiation, lagging strand elongation, and replication restart."

Original comment by: tberardini
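The logical definition in this second attempt joins two capable_of_part_of clauses with AND. As a minimal sketch of how such a conjunction would be evaluated, assuming a simple set-based check rather than a real reasoner: the two example complexes and their asserted capabilities are invented for illustration, and the genus ('protein complex') is left out for brevity.

```python
# Differentia from the proposed logical definition of GO:1990077.
# The genus 'protein complex' is omitted here for brevity.
PRIMOSOME_DIFFERENTIA = frozenset({
    ("capable_of_part_of", "lagging strand initiation"),
    ("capable_of_part_of", "replication fork processing"),
})


def satisfies(asserted):
    """True only if every differentia is among the asserted (relation, process) pairs."""
    return PRIMOSOME_DIFFERENTIA <= set(asserted)


# Hypothetical complexes, invented for illustration only.
both_capabilities = {
    ("capable_of_part_of", "lagging strand initiation"),
    ("capable_of_part_of", "replication fork processing"),
}
initiation_only = {
    ("capable_of_part_of", "lagging strand initiation"),
}

print(satisfies(both_capabilities))  # True
print(satisfies(initiation_only))    # False
```

Because the clauses are conjunctive, only a complex asserted to be capable of both processes would be classified under the defined term in this sketch; one asserted for a single process would not.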

gocentral commented 9 years ago

Thanks Tanya,

Looks good to me :)

Birgit

Original comment by: bmeldal

gocentral commented 9 years ago

Looks fine to me too. Thanks! m

Original comment by: mah11

gocentral commented 9 years ago

Original comment by: tberardini

gocentral commented 9 years ago

Thanks everyone!

Added:
+id: GO:0090629
+name: lagging strand initiation
+namespace: biological_process
+def: "The process in which the synthesis of DNA from a template strand in a net 3' to 5' direction is started." [GOC:mah, GOC:tb]
+is_a: GO:0006259 ! DNA metabolic process
+relationship: part_of GO:0006273 ! lagging strand elongation

Changed:
id: GO:1990077
name: primosome complex
namespace: cellular_component
-def: "Any of a family of protein complexes that form at the origin of replication and function in replication restart in all organisms. Early complexes initiate double-stranded DNA unwinding. The core unit consists of a replicative helicase and a primase. The helicase further unwinds the DNA and recruits the polymerase machinery. The primase synthesizes RNA primers that act as templates for complementary stand replication by the polymerase machinery. The primosome contains a number of associated proteins and protein complexes and is part of the replication initiation process." [GOC:bhm, PMID:21856207]
+def: "Any of a family of protein complexes that form at the origin of replication or stalled replication forks and function in replication primer synthesis in all organisms. Early complexes initiate double-stranded DNA unwinding. The core unit consists of a replicative helicase and a primase. The helicase further unwinds the DNA and recruits the polymerase machinery. The primase synthesizes RNA primers that act as templates for complementary stand replication by the polymerase machinery. The primosome contains a number of associated proteins and protein complexes and contributes to the processes of replication initiation, lagging strand elongation, and replication restart." [GOC:bhm, GOC:mah, PMID:21856207]
+intersection_of: GO:0043234 ! protein complex
+intersection_of: capable_of_part_of GO:0031297 ! replication fork processing
+intersection_of: capable_of_part_of GO:0090629 ! lagging strand initiation

Original comment by: tberardini
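For anyone who wants to check the committed stanza programmatically, the following is a rough sketch of reading its tag-value lines into a dictionary. The `parse_stanza` helper is hypothetical and handles only the simple `tag: value ! label` shape seen in this ticket; it is not a full OBO parser, and a real workflow would more likely use an established ontology library.

```python
from collections import defaultdict

# Stanza text copied from the "Added" entry in this ticket.
STANZA = """\
id: GO:0090629
name: lagging strand initiation
namespace: biological_process
def: "The process in which the synthesis of DNA from a template strand in a net 3' to 5' direction is started." [GOC:mah, GOC:tb]
is_a: GO:0006259 ! DNA metabolic process
relationship: part_of GO:0006273 ! lagging strand elongation
"""


def parse_stanza(text):
    """Parse 'tag: value' lines into {tag: [values]}.

    Hypothetical helper: trailing '! label' comments are dropped for every
    tag except 'def', whose quoted text is kept verbatim.
    """
    tags = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        tag, _, value = line.partition(": ")
        if tag != "def":
            value = value.split(" ! ")[0]
        tags[tag].append(value)
    return dict(tags)


term = parse_stanza(STANZA)
print(term["id"])            # ['GO:0090629']
print(term["is_a"])          # ['GO:0006259']
print(term["relationship"])  # ['part_of GO:0006273']
```

Values are kept as lists because tags such as is_a and intersection_of can repeat within one stanza.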