geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

NTR for Sca1 protein complex #12789

Closed rjdodson closed 7 years ago

rjdodson commented 7 years ago

Hello: I would like to request a new GO component term for a protein complex identified in Dictyostelium discoideum.

Name: Sca1 complex

Ontology: cellular_component

Synonyms: Ras signaling complex, Sca1 signaling complex

Definition: A multi-protein complex which functions biologically to control the RasC-TorC2 pathway and thereby regulate cell motility, chemotaxis, and the relay of the cAMP chemoattractant signal in Dictyostelium cells. The core components of the Sca1 complex are the Ras guanine exchange factor (RasGEF) Aimless, RasGEFH, protein phosphatase 2A subunit A (PP2A), and a scaffold protein designated Sca1, and also including the pleckstrin homology domain protein PHR and protein phosphatase PP2A-C2.

The position of this term in the GO hierarchy should be as a child of:

GO:0043234 protein complex

References: PMID:20493808, A Ras signaling complex controls the RasC-TORC2 pathway and directed cell migration.

See Fig. 1K in the above reference for a diagram of the complex. Please let me know if you have any questions. Bob Dodson

bmeldal commented 7 years ago

Hi Bob,

Could you please add the relationships to MF and BP (if applicable) so we can axiomise the terms more precisely? We are looking for capable_of and capable_of_part_of.

A part_of would also be helpful.

Is there no intermediate complex node this term could be a child of?

Many thanks, Birgit (Complex Portal curator at IntAct, EBI)

rjdodson commented 7 years ago

Hi Birgit:

Sorry to be behind the curve here, but I'm not completely sure what you're asking. Do you mean add MF and BP terms to those Dictyostelium proteins mentioned as part of the complex? If so, I am currently doing so (but have not yet finished).

Also, not sure what you mean by capable_of, capable_of_part_of, and part_of. Could you please explain further, or provide a link to a GO page explaining these so I can educate myself?

I am not aware of an intermediate complex node that this term may be a child of.

Pardon my ignorance, but I haven't submitted a GO complex NTR before, so I'm not sure of a lot of this.

Thanks for helping to improve my request, Bob

bmeldal commented 7 years ago

Hi Bob,

apologies for leaving you confused (hopefully not dazed ;-) !). I don't know where in the world you are based but I had some detailed discussions with Petra at the GOC meeting in Geneva so she could give you some pointers as well.

I noticed the ticket is marked "editors-discussion" so I shall be brief (sort of!):

GO are trying to automatise more and more processes, and as we define the "complex" branches/classes more we can do a lot of re-assignments automatically if we use relationships when creating the terms. The easy one is part_of = just state where in the cell or extracellular region/space the complex is found. This could be as loose as or as precise as etc...

capable_of links to any MF term the complex is related to, so if it's an enzyme then it would be capable_of some or if it binds to X then it would be capable_of some . capable_of_part_of would be the same for BP terms, e.g. . In your case probably and if those terms exist. Just list them as free text and the editor will copy them into the tool when creating the new term.

More detailed guidelines can be found here: http://wiki.geneontology.org/index.php/Guidelines_on_%27protein_complex%27_terms#Adding_appropriate_part_of_relationships

Birgit

rjdodson commented 7 years ago

Hi Birgit:

Thanks for your response and for providing the link to the guidelines, both are most helpful. And yes, Petra and I can discuss further as well.

I'll try and answer a few questions initially, maybe add more later.

I see from reading your guidelines that the location of the complex may actually be our first problem. That is because the Sca1 complex has been identified in two locations: it is primarily located in the cytosol; however, it is recruited to the plasma membrane (specifically to the leading edge membrane of chemotaxing cells) during chemoattractant stimulation. Exactly how it is recruited is unknown. I have annotated the scaffold protein (Sca1) of the complex with 3 GO terms:

GO:0005829 cytosol
GO:0005886 plasma membrane
GO:0031252 cell leading edge (with annotation extension during chemotaxis to cAMP)

Regarding the MF terms for the complex, here is what I have currently curated in Protein2GO for 4 of the protein components in the complex:

Sca1 (DG1105)
GO:0032947 protein complex scaffold
GO:0005515 protein binding (binds to: gefA, aka Aimless)
GO:0005515 protein binding (binds to: pppA, the protein phosphatase)

gefA
GO:0005088 Ras guanyl-nucleotide exchange factor activity
GO:0005515 protein binding (binds to: Sca1, the scaffold protein)
GO:0005515 protein binding (binds to: gefH, another RasGEF)

pppA
GO:0005515 protein binding (binds to: Sca1, the scaffold protein)

gefH
GO:0005515 protein binding (binds to: gefA, aka Aimless)

So the authors have shown that Sca1 is the scaffold, that it binds to both pppA and gefA, and that gefA binds to gefH. Other proteins appear to be associated with the complex, but binding was not firmly established in this reference. And there are other GO terms not mentioned above, e.g. pppA is a protein serine/threonine phosphatase. Would you like me to provide here a complete list of all MF GO terms for all proteins in the complex?

With regard to BP, I think the main process is: GO:0007265 Ras protein signal transduction

However, there are other developmental and regulatory processes which are affected by this (some of which I have currently only curated as strain phenotypes in dictyBase, not in GO), so my point is, I could include more processes if desired.

I am not sure if this complex is specific to dicty, or if it is found in other species.

Sorry for the long-winded response. I hope this answers some of your questions, but it may raise others. Anyway, let me know and I'll be happy to provide more information as needed.

Thanks. Bob

bmeldal commented 7 years ago

Thanks, Bob.

Location: use the one where it is functional. If it's in the cytosol because it gets released from the ER and recruited to the PM but acts on the leading edge then the latter is its location as functional unit. It gets complicated when you have things like nucleocytoplasmic shuttling complexes! They become part_of cell...

Activities: Ignore protein binding - that applies to almost all complexes. Only use binding if there is a more specific target, such as the DNA regions in transcription etc.

Sca1 (DG1105) GO:0032947 protein complex scaffold

Ignore this as it's Sca1's function WITHIN the complex.

gefA GO:0005088 Ras guanyl-nucleotide exchange factor activity

Is this the activity of the complex? If yes, then this will be its relationship! And it also allows the complex to have the is_a relationship to "GO:0032045 guanyl-nucleotide exchange factor complex". NB: If the complex has an activity but no specific related complex term exists in GO we would like to create one in order to make the hierarchy more granular. This can be done via TG templates (if you have been shown how to use them) or by requesting them in a GH ticket :)

pppA is a protein serine/threonine phosphatase. Would you like me to provide here a complete list of all MF GO terms for all proteins in the complex?

We only add the terms if the activity applied to the COMPLEX, so if the complex has phosphatase activity then "yes, please". In that case it will also have the is_a relationship to "GO:0008287 protein serine/threonine phosphatase complex".
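
The filtering described above can be pictured with a minimal Python sketch. It assumes the GEF activity is confirmed as an activity of the whole complex. The `Annotation` tuple, the `shown_for_complex` flag, and the function name are hypothetical and not part of any GO tool.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    subunit: str
    go_id: str
    label: str
    shown_for_complex: bool  # curator's judgement from the paper (assumed flag)


# Terms to ignore at the complex level, following the advice in this comment.
IGNORED = {
    "GO:0005515",  # protein binding: applies to almost all complexes
    "GO:0032947",  # protein complex scaffold: a role WITHIN the complex
}


def complex_capable_of(annotations):
    """Return GO IDs that describe the complex as a whole, in first-seen order."""
    kept = []
    for ann in annotations:
        if ann.go_id in IGNORED or not ann.shown_for_complex:
            continue
        if ann.go_id not in kept:
            kept.append(ann.go_id)
    return kept


# Subunit annotations as listed earlier in the thread.
subunit_annotations = [
    Annotation("Sca1", "GO:0032947", "protein complex scaffold", False),
    Annotation("Sca1", "GO:0005515", "protein binding", False),
    Annotation("gefA", "GO:0005088",
               "Ras guanyl-nucleotide exchange factor activity", True),
    Annotation("gefA", "GO:0005515", "protein binding", False),
    Annotation("pppA", "GO:0005515", "protein binding", False),
    Annotation("gefH", "GO:0005515", "protein binding", False),
]

print(complex_capable_of(subunit_annotations))
```

Only terms that survive this kind of filter would be listed as capable_of candidates for the complex term.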

With regard to BP, I think the main process is: GO:0007265 Ras protein signal transduction. However, there are other developmental and regulatory processes which are affected by this (some of which I have currently only curated as strain phenotypes in dictyBase, not in GO), so my point is, I could include more processes if desired.

Happy to include all that have some sort of experimental evidence.

I am not sure if this complex is specific to dicty, or if it is found in other species.

A quick search in Uniprot or pubmed for related proteins might give you an idea.

Finally, the Def should then reflect the activities the complex is involved in, in a little more detail.

Happy complex curation!

Birgit

rjdodson commented 7 years ago

Hi Birgit:

Instead of directly responding to all points in your previous comment, I'm going to address a couple, and then submit a revised request below, hopefully taking everything into account. First, the authors individually knocked out all the genes in the complex to assess phenotypes, so the GO BP and MF terms are all based on IMP evidence. The Ras guanyl-nucleotide exchange activity associated with gefA and gefH is the only activity shown for the complex. Phosphatase activity is not shown. So, my definition is weighted toward processes as opposed to activities, as this is what is emphasized in the reference. I included capable_of and capable_of_part_of terms after the definition. I also did a search in Uniprot and pubmed and did not find the complex in other species. The authors note that scaA does not have orthologs in other species (other than Dictyostelids), so it may be that this complex is specific to Dictyostelium.

Below is my revised request. Let me know if I've left out something and I'll follow up.

Thanks, Bob


Name: Sca1 complex

Ontology: cellular_component

Synonyms: Ras signaling complex, Sca1 signaling complex, ScaA signaling complex

Definition: A multi-protein complex capable of Ras guanyl-nucleotide exchange factor activity. The primary role of the Sca1 complex is to promote RasC activation. The Sca1 complex functions biologically to regulate cell motility, chemotaxis, and the relay of the cAMP chemoattractant signal in Dictyostelium cells. The core components of the Sca1 complex are the Ras guanine exchange factor (RasGEF) Aimless, RasGEFH, protein phosphatase 2A subunit A (PP2A), and a scaffold protein designated Sca1, and also including the pleckstrin homology domain protein PHR and protein phosphatase PP2A-C2.

capable_of: GO:0005088 Ras guanyl-nucleotide exchange factor activity

capable_of_part_of: GO:0046579 positive regulation of Ras protein signal transduction
capable_of_part_of: GO:0043327 chemotaxis to cAMP

location: GO:0031252 cell leading edge

The position of this term in the GO hierarchy should be as a child of:

GO:0043234 protein complex

References: PMID:20493808, A Ras signaling complex controls the RasC-TORC2 pathway and directed cell migration.
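
For readers who prefer structured data, the request above could be captured roughly as in the Python sketch below, which renders an OBO-style stanza. This is only an illustration under stated assumptions: the `TermRequest` class, the temporary ID, the `RELATED` synonym scope, and the encoding of the location as a part_of relationship are all hypothetical choices, not how the GO editors necessarily record the term. The definition is shortened to its first sentence.

```python
from dataclasses import dataclass, field


@dataclass
class TermRequest:
    """Fields of a new-term request; the class itself is illustrative."""
    name: str
    namespace: str
    definition: str
    reference: str
    parent: tuple                                       # (GO ID, label)
    synonyms: list = field(default_factory=list)
    relationships: list = field(default_factory=list)   # (relation, GO ID, label)

    def to_stanza(self, temp_id="TEMP:0000001"):
        """Render the request as an OBO-style [Term] stanza (temp_id is a placeholder)."""
        lines = [
            "[Term]",
            f"id: {temp_id}",
            f"name: {self.name}",
            f"namespace: {self.namespace}",
            f'def: "{self.definition}" [{self.reference}]',
        ]
        # Synonym scope RELATED is an assumption; the request does not state one.
        lines += [f'synonym: "{s}" RELATED []' for s in self.synonyms]
        lines.append(f"is_a: {self.parent[0]} ! {self.parent[1]}")
        lines += [f"relationship: {rel} {go_id} ! {label}"
                  for rel, go_id, label in self.relationships]
        return "\n".join(lines)


request = TermRequest(
    name="Sca1 complex",
    namespace="cellular_component",
    definition=("A multi-protein complex capable of Ras guanyl-nucleotide "
                "exchange factor activity."),
    reference="PMID:20493808",
    parent=("GO:0043234", "protein complex"),
    synonyms=["Ras signaling complex", "Sca1 signaling complex",
              "ScaA signaling complex"],
    relationships=[
        ("capable_of", "GO:0005088",
         "Ras guanyl-nucleotide exchange factor activity"),
        ("capable_of_part_of", "GO:0046579",
         "positive regulation of Ras protein signal transduction"),
        ("capable_of_part_of", "GO:0043327", "chemotaxis to cAMP"),
        # Assumption: the requested location is expressed as part_of.
        ("part_of", "GO:0031252", "cell leading edge"),
    ],
)

print(request.to_stanza())
```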

ukemi commented 7 years ago

Hi Bob and Birgit,

I thought I'd comment since I claimed this ticket and have tagged it with 'editors-discussion'. My concern with the ticket is the specificity of the complex and the apparent generic function of the complex. In the long run I'm not sure how easy it will be to keep things complete and consistent in the ontology as we go down the road of adding complexes like this. For example, if we find a mammalian complex that functions in the same process and we add it to the ontology, is there a way for us to identify that this complex should be related? What if we find that a very similar complex in subunit structure carries out a very different process? If you have any ideas, I'm all ears. I will let you know when this comes up on the agenda for discussion on a call.

rjdodson commented 7 years ago

Hi David:

Thanks for commenting. It is an interesting question and I guess does present concerns about future complex requests. Reviewing your guidelines at:

http://wiki.geneontology.org/index.php/Guidelines_on_%27protein_complex%27_terms#Adding_appropriate_part_of_relationships

I see that this complex seems to meet 7 out of the 8 rules in the guidelines. That is all except rule 2: Is the complex species-agnostic.

I again searched pubmed and don't see any further results of this complex in other species. That, and the fact that ScaA lacks non-dicty orthologs, suggests that it is probably species-specific. There are also no follow-up papers from the authors that I can find, so this has not really been pursued much in Dicty. Some of the authors' comments in the discussion section address one of your questions above: "What if we find that a very similar complex in subunit structure carries out a very different process?" I'll just quote them:

"Although Sca1 does not appear to be evolutionarily conserved, we found that the Dictyostelium genome encodes a Sca1-related protein, designated Sca2 (DDB_G0267776). Interestingly, we have preliminary data suggesting that Sca2 also associates with two LisH domain-containing RasGEFs, RasGEF-F and RasGEFI (Wilkins et al., 2005), as well as the PP2A-A/C2 core enzyme (S. Lee, P.G.C., and R.A.F., unpublished data). We do not yet know which Ras proteins are regulated by RasGEFF and RasGEFI, but it appears that the Sca2 complex regulates different cellular functions than those controlled by the Sca1 complex. Thus, the existence of RasGEF-containing complexes extends beyond the Sca1 complex described here, although its presence in other organisms is unknown."

So back in 2010 there was some preliminary, but curiously still unpublished, evidence that a similar complex in dicty regulates different processes.

So where does that leave us? I see in the guidelines you state: "Species-specific complexes don't belong in GO, but IntAct/Complex Portal and/or PRO can take them."

So if that restriction applies here, then maybe this request should just be rejected. But then if so, how should I curate these in GO? Do I just attach the generic parent term GO:0043234 protein complex to each of the proteins involved, and then use the with/from field to specify the other members of the complex? Or any other suggestions?

Thanks, Bob

bmeldal commented 7 years ago

Hi Bob and David,

The Ras guanyl-nucleotide exchange activity associated with gefA and gefH is the only activity shown for the complex.

This means that they have shown an activity and you can use the more specific component term of GO:0032045 guanyl-nucleotide exchange factor complex for all participants. We could even create the Ras-specific child based on the activity term. That would be ok as there are lots of Ras guanyl-nucleotide exchange factor complexes. This can be done in TG. That is what I would do if I was creating the entry in the Complex Portal.

David's concern is that if we add lots of very (species-)specific complex terms we inflate the ontology to the point where it becomes less meaningful. We have had these discussions for a while... In the past people would create such complexes and that's why there is so little granularity within the protein complex and macromolecular complex classes.

David, let us know asap what the Editors prefer to do but it looks like it could be a fairly easy complex to create in the Complex Portal, then Bob can use our AC as annotation target for all complex members. We don't have any dicty complexes yet but that won't stop us creating some!

The Complex Portal is here if you'd like to have a snoop around: www.ebi.ac.uk/intact/complex

Birgit

PS: I wonder if @ValWood would like to chime in ;-)

rjdodson commented 7 years ago

Hi Birgit:

Thanks for your comment, just two follow-up points from me. First, it may be important to note that the Ras guanyl-nucleotide exchange activity function is based on IMP evidence. That is, this activity for these proteins is inferred by me based on a process phenotype, which I have curated as positive regulation of Ras protein signal transduction. There is no IDA evidence that gefA and gefH have this activity. I think this is allowed annotation practice, but I am always hesitant to curate MF terms using IMP as ev_code. Second, I completely agree with David's concern about too many species-specific component terms (same for MF and BP terms too). An inflated and unwieldy ontology is in no one's best interests. So if you'd prefer me to add this just to the complex portal instead, I'm fine with that.

Let me know, and in the meantime I'll try and poke around a bit in the complex portal if I have time.

Thanks, Bob

ValWood commented 7 years ago

PS: I wonder if @ValWood would like to chime in ;-)

I was avoiding it;) I had a quick look last night to see if I could detect any conservation (especially for the scaffold subunit)...but Dicty proteins have all of those funny insertions, so I gave up. However the scaffold proteins in these complexes are notoriously difficult to detect orthologs for, even when the complexes are clearly conserved.

For example, in this SIN signalling GTPase complex we have been unable to detect the Sid4 component outside fungi, although the rest of the complex is conserved throughout eukaryotes. I didn't request a complex term for this, because it is difficult to define the boundaries of the functional complex. Different components come together at different stages....

Analogously, there are always GTPase complexes operating upstream of TOR (multiple, presumably). This request appears to be analogous to one of these:

[image: 2]

However, this request appears to refer to a partial/non-functional complex, before the "Ras guanyl-nucleotide exchange factor" is associated with the GTPase it activates?

Also, because these networks are so complicated and the complex components are constantly changing over time, it might not be very valuable to create these. To illustrate, this is the network automatically generated from GO data for the cdc42 GTPase region. The purple connections represent this complex http://www.pombase.org/spombe/related/GO:0071521. The orange arrows represent "activities".

[image: cdc42-network]

I would no longer request this complex. Instead, I would now look for the information about how ras1 and scd2 are connected "functionally" in this pathway and avoid the complex quagmire.

(I would still request a complex for a clearly defined, conserved, functional unit)

rjdodson commented 7 years ago

This is a GTPase complex operating upstream of Tor. Here is the authors' schematic of this signaling pathway, also indicating that the GTPase RasC is not part of the complex the authors identified.

[image: fig7]

ValWood commented 7 years ago

But this is just a "snapshot in time". When the complex is functioning as a GEF, it needs to be interacting with the GTPase.

rjdodson commented 7 years ago

Agreed, this is just a snapshot in time. I assigned GO protein binding terms to just those proteins which the authors experimentally determined. In the diagram above that is Sca1, PP2A-A, Aimless and GEFH. In my definition of the protein complex in this request, I also stressed these same proteins and was using this figure above as my guide. The authors did not establish any interaction between RasC and the complex, and they don't really refer to RasC as being part of the complex, so I just naively followed their description. So, should I have also included RasC in the definition as part of the complex then? Sorry, I have not requested a complex GO term before and am not really so familiar with all of this.

Thanks, Bob

bmeldal commented 7 years ago

I was avoiding it;)

But your input has been helpful!

Sounds like there are a lot of uncertainties. Bob's IMP annotations of the gene products to the specific Ras GEF activity should be fine if that's what has been shown in the paper. I haven't done much on GEFs or GTPases but I have a bunch of mTOR related complexes sitting on my checking list!

Whenever I have the dilemma of changing compositions I try to drill down to the components that make up the complex at the point of its biological activity (as far as we know at the time of curation!). That means some chaperones or scaffolding proteins will not be part of it but regulators may well be. I would then mention the 'floating' units in the functional description. And if there's uncertainty I mention it, too. A helpful thought is always to think of Reactome (I appreciate, there probably isn't one for dicty) and its assembly reactions. It's the final product that counts! If you then have a fairly well defined complex you can use that as annotation object.

Sorry, I have not requested a complex GO term before and am not really so familiar with all of this.

It's a steep learning curve and we are still trying to find a sensible, workable solution. Complexes are - well - complex!!!

rjdodson commented 7 years ago

So sounds like this is a complex that should be added to the Complex Portal rather than to GO. But just for the sake of completeness, here is a revised definition for the complex including RasC. I've added rasC, but kept Sca1 in the definition as the complex is named after this protein in this reference. So should I request this to be added to the Complex Portal instead? Thanks, Bob

Name: Sca1 complex

Ontology: cellular_component

Synonyms: Ras signaling complex, Sca1 signaling complex, ScaA signaling complex

Definition: A protein complex capable of Ras guanyl-nucleotide exchange factor activity and formed by the association of the GTPase RasC with additional proteins. These include RasGEFA (Aimless), RasGEFH, protein phosphatase 2A subunit A (PP2A), and a scaffold protein designated Sca1. Additional members may include a pleckstrin homology domain protein (PHR) and a protein phosphatase regulatory protein (PP2A-C2). The Sca1 complex functions biologically to regulate cell motility, chemotaxis, and the relay of the cAMP chemoattractant signal in Dictyostelium cells.

capable_of: GO:0005088 Ras guanyl-nucleotide exchange factor activity

capable_of_part_of: GO:0046579 positive regulation of Ras protein signal transduction
capable_of_part_of: GO:0043327 chemotaxis to cAMP

location: GO:0031252 cell leading edge

The position of this term in the GO hierarchy should be as a child of:

GO:0043234 protein complex

References: PMID:20493808, A Ras signaling complex controls the RasC-TORC2 pathway and directed cell migration.

ukemi commented 7 years ago

@bmeldal Would you request this complex as a GO complex? I still want to discuss it with the editors, but one solution would be to annotate the proteins to (GO:0032045) 'guanyl-nucleotide exchange factor complex'.

Would you include the Ras as part of the complex? I don't know why but that seems strange to me since it is the thing on which the GEF is acting.

It's a steep learning curve and we are still trying to find a sensible, workable solution. Complexes are - well - complex!!!

I will second that!

bmeldal commented 7 years ago

I'm on a group retreat Mon-Wed. I'll come back to it on my return. I haven't actually read the paper yet so I don't know the fine details of what's part of the complex; I have so far gone by what Bob summarised.

Speak to you soon!

rjdodson commented 7 years ago

Thanks for the suggestion. I have now added the GO term: GO:0032045 guanyl-nucleotide exchange factor complex, to members of this complex.

Birgit, it would be great if you want to read the paper; it would probably be good to have someone more experienced review this as I could have made errors. But, not trying to create more work for you!

Anyway, thanks for both of your comments. Bob

ValWood commented 7 years ago

@ukemi Would you include the Ras as part of the complex? I don't know why but that seems strange to me since it is the thing on which the GEF is acting.

This might be naive old text book knowledge, but I always thought that when the GEF was active it was necessarily in a complex with the Ras. Now I think about it that is likely to be nonsense.

bmeldal commented 7 years ago

I have read the paper and to me it doesn't look like RasC is part of the complex but it's its target. The paper has enough info to curate the complex in the CP but I'll wait for the Editors to make their decision in general. As it acts on Ras, I would create the more specific "Ras guanyl-nucleotide exchange factor complex" as child of "GO:0032045 guanyl-nucleotide exchange factor complex" based on the existing "GO:0005088 Ras guanyl-nucleotide exchange factor activity".

@ukemi when will you have a decision from the Editors on this?

bmeldal commented 7 years ago

Protein IDs:

These 3 are straightforward:
Aimless/gefA = Q54PQ4 (GEFA_DICDI, Ras guanine nucleotide exchange factor A)
RasGEFH/gefH = Q8IS16 (GEFH_DICDI, Ras guanine nucleotide exchange factor H)
PP2A (PP2A-A)/pppA = Q54QR9 (2AAA_DICDI, Serine/threonine-protein phosphatase 2A regulatory subunit pppA)

UniProt/naming inconsistency:
PP2A-C2/pho2B = Q54RD6 (PP2AB_DICDI)
The UniProt name is "Probable serine/threonine-protein phosphatase 2A catalytic subunit B" while the paper describes it as a C subunit - does the UniProt entry need mending? Tbh, the protein name is C2 and the gene name is 2B, that doesn't help!

Only in TrEMBL: If I curate this complex it would be nice to have these proteins in UniProt (rather than TrEMBL)...

Sca1/scaA = Q54XY4 (Q54XY4_DICDI, Uncharacterized protein, UniProt gene name = DG1105) = dictyBase gene ID = DDB_G0277843, Name Description=Developmental Gene SCAffold protein

and

PHR/phr = Q1ZXQ0 (Q1ZXQ0_DICDI, Uncharacterized protein, UniProt gene name = DDB_G0270932) = dictyBase gene ID=DDB_G0270932, Name Description=Pleckstrin Homology domain RasGTPase-related domain

rjdodson commented 7 years ago

Hi Birgit:

Thanks for your comments and for reading the paper. I looked into this a bit and found some problems with the protein phosphatase protein names which I have appended at the end. But as executive summary, I recommend the following as names for Sca1, pho2B, PHR, and pppA:

Q54XY4 gene: scaA, sca1; protein: Sca1 complex scaffold protein (I have named the gene in dictyBase as scaA, using sca1 and DG1105 as synonyms.)

Q1ZXQ0 gene: phr; protein: Sca1 complex protein PHR
I also have an alternate name: pleckstrin homology domain-containing protein

Q54RD6 gene: pho2B; protein: protein phosphatase 2A catalytic subunit 2

Q54QR9 gene: pppA; protein: protein phosphatase 2A scaffold subunit
Uniprot has this incorrectly named as a regulatory subunit.

Also, in my definition above I referred to pho2B as a regulatory subunit, but this is wrong ("Additional members may include a pleckstrin homology domain protein (PHR) and a protein phosphatase regulatory protein (PP2A-C2)."). This should be "protein phosphatase catalytic subunit (PP2A-C2)".

Let me know if you have any questions or objections, or if this is not clear (I found it confusing!) and we can follow up from there.

Thanks, Bob

Below are notes on the protein phosphatase complex gene nomenclature, sort of off on a tangent with regard to this ticket and mostly for my own benefit. You can read if interested, but not necessary.

The nomenclature here is confusing, but here is what I think is going on:

Note that the authors of this reference refer to Janssens et al 2008 and mention that the protein phosphatase genes themselves form a complex typically consisting of 3 subunits:
- core (aka scaffold) structural gene (subunit A)
- catalytic subunit (subunit C)
- regulatory subunit (subunit B)

The authors have identified the core (pppA) and the catalytic (pho2B) subunits as being part of the Sca1 complex. They do not identify a regulatory subunit.

This becomes confusing because the dicty genome appears to contain 5 protein phosphatase complex genes: 1 core subunit gene, 2 catalytic subunit genes and 2 regulatory subunit genes.

1) protein phosphatase 2A scaffold subunit (dictyBase gene pppA), uniprot Q54QR9 (subunit A following Janssens)

2) protein phosphatase 2A catalytic subunit 1 (dictyBase gene pho2a), uniprot Q9XZE5
3) protein phosphatase 2A catalytic subunit 2 (dictyBase gene pho2b), uniprot Q54RD6 (subunit C following Janssens)

4) protein phosphatase 2A regulatory subunit 1 (dictyBase gene phr2ab), uniprot Q54Q99
5) protein phosphatase 2A regulatory subunit 2 (dictyBase gene psrA), uniprot Q54VB6 (subunit B following Janssens)

So, here I think the authors are calling pho2B the C2 subunit to distinguish it from the catalytic subunit of the pho2a gene (C1). Uniprot refers to these as catalytic subunit A and B, respectively. I think the protein name in Uniprot that is incorrect is that attached to Q54QR9. This is not a regulatory subunit, but rather a scaffold subunit. The names for these proteins in dictyBase are also confused and need to be updated.

I think use of "..subunit 1, or subunit 2" is less confusing for these, so I will change all dictyBase gene products to match those noted above (keeping names like PP2A-C2 as alternate names of course).
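
As a compact summary of the nomenclature notes above, the five genes can be laid out as a small lookup table. This is a minimal Python sketch: the dictionary layout and the helper name `genes_by_role` are made up for illustration, while the gene names, accessions, and A/B/C classes are the ones listed in these notes.

```python
# PP2A genes in Dictyostelium as listed above; "subunit" is the A/B/C class
# following Janssens et al. 2008 as described in these notes.
PP2A_GENES = {
    "pppA":   {"uniprot": "Q54QR9", "role": "scaffold",   "subunit": "A",
               "name": "protein phosphatase 2A scaffold subunit"},
    "pho2a":  {"uniprot": "Q9XZE5", "role": "catalytic",  "subunit": "C",
               "name": "protein phosphatase 2A catalytic subunit 1"},
    "pho2b":  {"uniprot": "Q54RD6", "role": "catalytic",  "subunit": "C",
               "name": "protein phosphatase 2A catalytic subunit 2"},
    "phr2ab": {"uniprot": "Q54Q99", "role": "regulatory", "subunit": "B",
               "name": "protein phosphatase 2A regulatory subunit 1"},
    "psrA":   {"uniprot": "Q54VB6", "role": "regulatory", "subunit": "B",
               "name": "protein phosphatase 2A regulatory subunit 2"},
}

# PP2A genes the authors place in the Sca1 complex (core plus catalytic).
SCA1_COMPLEX_PP2A = ("pppA", "pho2b")


def genes_by_role(role):
    """Return the gene names with the given role, sorted alphabetically."""
    return sorted(g for g, info in PP2A_GENES.items() if info["role"] == role)


# Roles represented in the Sca1 complex; no regulatory subunit was identified.
roles_in_complex = {PP2A_GENES[g]["role"] for g in SCA1_COMPLEX_PP2A}

print(genes_by_role("catalytic"))
print("regulatory" in roles_in_complex)
```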

bmeldal commented 7 years ago

Bob,

Thanks for digging into the naming issue. Can I suggest you update dictyBase with the best knowledge to date and then send a correction request to UniProt (through their "contact us") and ask them to update UniProt accordingly? They might have a script that updates the MOD entries with every release, I don't know the details.

I'd prefer not to spend more time chasing names as I'm no dicty expert (this is my first proper dicty interaction, apart from chatting to Petra in Geneva!).

Birgit

ukemi commented 7 years ago

As it acts on Ras, I would create the more specific "Ras guanyl-nucleotide exchange factor complex" as child of "GO:0032045 guanyl-nucleotide exchange factor complex" based on the existing "GO:0005088 Ras guanyl-nucleotide exchange factor activity".

This seems pretty straightforward to me since we have the molecular function term. So this will simply be a protein complex that is capable of the function. I'll keep this ticket on an upcoming agenda for editors just as an example, but I think for this specific case this is resolved if you are ok with it.

bmeldal commented 7 years ago

How do we make the new term while the TG logon is being moved over to GH? If @rjdodson can wait a few more days we might be able to use the TG template easing the editors' workload. He can then annotate the proteins to the new term. Or do you still need the specific CP term as annotation object?

@rjdodson have you got access to TG?

rjdodson commented 7 years ago

Hi Birgit and David:

First, Birgit, I have updated these gene and protein names in dictyBase, then done as you suggested previously and submitted correction requests to Uniprot for the relevant genes.

And yes, I do have access to Term Genie. So if I understand correctly, I should wait a few days and then as David suggested request a new GO term "Ras guanyl-nucleotide exchange factor complex" using Term Genie. Is that the plan then?

Thanks, Bob

ukemi commented 7 years ago

That sounds great. I will keep this open until the TG term has been approved.

ukemi commented 7 years ago

Hi Bob,

It looks like you can use TG now. Let me know if you are successful.

-D

bmeldal commented 7 years ago

Yes, TG looks 'normal' this morning.

Bob, let me know when you have created the new GO term and I will get on with adding it to the CP. However, it takes time for it to filter through our checking and release process before it appears in P2GO and we'll miss the last release before Christmas. It should come out in January.

bmeldal commented 7 years ago

And can you please let me know when you have had a response from UniProt? It will take a bit of time, so it might not filter through before I make the complex, but that doesn't matter.

rjdodson commented 7 years ago

Hi Birgit:

Just tried adding this at TermGenie, but got the following error:

Could not commit, the user is not authorized to execute a commit.

I am logged in and have used Term Genie before w/o receiving this error. But I haven't used this tool recently. Any suggestions as to what the problem might be? Should I just submit this request as a NTR here?

Thanks, Bob

bmeldal commented 7 years ago

Bob, did you give the developers your GH ID to add to the allowed users of TG? Mine was initially missing from the new list (even though I have been using GH here so my ID was known to the GO...!). There was an email to one of the GO lists. It was regarding the Noctua transfer to GH but I checked with Seth and the same should have applied to TG users so he added mine.

TG looked 'normal' this morning but I haven't got anything to commit right now (other than re-trying with your term) so I can't check if I get the same error. If it persists we'll have to ask the developers; there is another ticket for the transition work of the TG login process from Persona to GH.

I'm off for our Xmas lunch at 11.30 UK time (in <30min), then school pickup so won't be able to check progress until later this afternoon (small-people-induced tantrums permitting!).

Birgit

mcourtot commented 7 years ago

Hi @rjdodson - I added a comment reflecting this to the current TG login ticket. Could you clarify how you logged in to TermGenie? The system was recently migrated to GitHub, and you didn't have a GH account recorded (I just added this to our metadata for you).

rjdodson commented 7 years ago

Hi Birgit:

I suspect that's the problem. I remember the email, but did not reply as I thought that was specific to Noctua users (which I also need to sign up for I think). Anyway, I'll see if I can find the email and reply so that I can become one of the allowed users.

Enjoy your holiday lunch and school-pickup, and I hope there will be no tantrums today. Our kids get off the school bus at 5:00 pm (Madrid time), so I have some time to work in peace!

Thanks, Bob

rjdodson commented 7 years ago

Hi Melanie:

I logged in with my user name: rjdodson and then my password. I did this yesterday, and found that I was still logged in this morning. Now looking back through emails from Seth to reply to that one. Or any other suggestions?

Thanks, Bob

bmeldal commented 7 years ago

Leaving the login issues to @mcourtot. Hopefully you are all sorted by the time I get back :)

mcourtot commented 7 years ago

Hi @rjdodson: I suspect that, as you didn't have an associated GH ID, the system reverted to trying to log you in using Persona (cc @nathandunn, can you confirm there is such a fallback?). I added your GH ID to the users file; if I remember correctly, the server can take up to 6h for this to take effect. I'd suggest trying again later or tomorrow, if that's OK. Sorry for the inconvenience. On the bright side, this is very helpful for making sure we are migrating the TG login system properly, so thanks for all the feedback!

rjdodson commented 7 years ago

OK, thanks Melanie. In the meantime, I found the email from Seth and just registered to be a Noctua user. Just letting you know in case my having done so in some way interferes with what you just did.

What about passwords, will my old one still be valid?

Thanks, Bob

mcourtot commented 7 years ago

Hi Bob,

The new system uses your GitHub account to authenticate you, so you will be asked to log in to GitHub. Attached is how it looks on my side.

[Screenshot, 2016-12-01 at 11:37:05]

I fill in my Github username and password and get access to TermGenie. So no more Persona username/password.

Noctua and TermGenie both read the same users.yaml file. It seems you were already registered as a Noctua user: https://github.com/geneontology/go-site/blob/master/metadata/users.yaml#L359 reads

  authorizations:
    noctua-go:
      allow-edit: true
    termgenie-go:
      allow-write: true

So no worries, there shouldn't be any conflict.
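
As a rough illustration of how a fragment like the one above could be checked, here is a minimal Python sketch. It assumes the YAML has already been parsed into a dictionary; the `is_allowed` helper and the `user_entry` variable are hypothetical names, not part of Noctua or TermGenie, and the real services may read the file differently.

```python
# Hypothetical, already-parsed entry mirroring the "authorizations" fragment above.
# Only the keys and values come from the thread; everything else is illustrative.
user_entry = {
    "authorizations": {
        "noctua-go": {"allow-edit": True},
        "termgenie-go": {"allow-write": True},
    }
}


def is_allowed(entry, app, permission):
    """Return True only if the entry explicitly grants `permission` for `app`."""
    return bool(entry.get("authorizations", {}).get(app, {}).get(permission, False))


# Both applications are covered by the same entry.
can_commit = is_allowed(user_entry, "termgenie-go", "allow-write")
can_edit_models = is_allowed(user_entry, "noctua-go", "allow-edit")

# An entry without an "authorizations" block grants nothing.
missing = is_allowed({}, "termgenie-go", "allow-write")

print(can_commit, can_edit_models, missing)
```

The point of the sketch is only that one entry can carry permissions for both tools, so registering for one should not overwrite the other.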

I'll write some documentation.

rjdodson commented 7 years ago

Thanks Melanie, I'll let you know later on if I have any problems logging in.

nathandunn commented 7 years ago

Good reminder that I should move ownership of the app token to someone other than myself. @kltm @cmungall should it go to bebop, geneontology, something else?

kltm commented 7 years ago

You can indeed have org-level OAuth2 apps/tokens (also a note for @DoctorBud there). https://github.com/organizations/geneontology/settings/applications I'd go ahead and re-deploy at some point with a new app registered there.

rjdodson commented 7 years ago

Hi Birgit:

Tried again to create this term and received a different error, pasted below:

Just to make sure I'm doing this properly, here is what I did. Maybe the error is caused by incorrect input on my part? If so please let me know. Thanks, Bob

Using TermGenie, I selected protein_complex_by_activity
Selected Use template
For required activity I added: Ras guanyl-nucleotide exchange factor activity
For Literature reference I added: PMID:20493808
For DefX_Ref I added: GOC:rjd
Selected Verify Input
Selected the box to the left of the term and the Submit for Review box
Selected Submit

I then received the following error:

2016-12-02 09:53:45 CommitTerms service call failed
Error: jsonrpc error[-32000] : Caused by java.lang.NullPointerException
  at org.obolibrary.oboformat.writer.OBOFormatWriter.write(OBOFormatWriter.java:634)
  at org.obolibrary.oboformat.writer.OBOFormatWriter.write(OBOFormatWriter.java:363)
  at org.bbop.termgenie.ontology.obo.OboWriterTools.writeFrame(OboWriterTools.java:52)
  at org.bbop.termgenie.ontology.CommitHistoryTools.create(CommitHistoryTools.java:56)
  at org.bbop.termgenie.ontology.CommitHistoryTools.translateTerms(CommitHistoryTools.java:137)
  at org.bbop.termgenie.ontology.CommitHistoryTools.create(CommitHistoryTools.java:122)
  at org.bbop.termgenie.ontology.OntologyCommitReviewPipeline.commit(OntologyCommitReviewPipeline.java:87)
  at org.bbop.termgenie.services.DefaultTermCommitServiceImpl$CommitTask.runSimple(DefaultTermCommitServiceImpl.java:426)
  at org.bbop.termgenie.ontology.OntologyIdManager$OntologyIdManagerTask.run(OntologyIdManager.java:62)
  at org.bbop.termgenie.ontology.OntologyIdManager$OntologyIdManagerTask.run(OntologyIdManager.java:58)
  at org.bbop.termgenie.core.management.GenericTaskManager.runManagedTask(GenericTaskManager.java:295)
  at org.bbop.termgenie.services.DefaultTermCommitServiceImpl.commitTerms(DefaultTermCommitServiceImpl.java:188)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.json.rpc.server.InjectingJsonRpcExecutor.executeMethod(InjectingJsonRpcExecutor.java:264)
  at org.json.rpc.server.InjectingJsonRpcExecutor.execute(InjectingJsonRpcExecutor.java:171)
  at org.bbop.termgenie.servlets.TermGenieJsonRPCServlet.doPost(TermGenieJsonRPCServlet.java:43)
  at javax.servlet.http.HttpServlet.service(HttpServlet.java:648)
  at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
  at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)
  at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)
  at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182)
  at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
  at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
  at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
  at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133)
  at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130)
  at com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203)
  at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
  at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
  at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
  at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
  at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
  at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
  at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
  at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
  at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
  at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
  at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
  at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
  at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
  at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
  at org.eclipse.jetty.server.Server.handle(Server.java:499)
  at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311)
  at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
  at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
  at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
  at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
  at java.lang.Thread.run(Thread.java:745)
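
For anyone triaging a trace like this, a small Python sketch can pull out the frames that matter. It is only an illustration: `flat_trace` is a shortened copy of the trace above, and `FRAME`, `parse_frames`, and `termgenie_frames` are hypothetical helper names, not part of TermGenie. The top frame is where the NullPointerException was raised, and the first `org.bbop.termgenie` frame is the TermGenie code that called into the OBO writer.

```python
import re

# A few frames copied from the trace above, shortened for illustration.
flat_trace = (
    "Caused by java.lang.NullPointerException "
    "at org.obolibrary.oboformat.writer.OBOFormatWriter.write(OBOFormatWriter.java:634) "
    "at org.obolibrary.oboformat.writer.OBOFormatWriter.write(OBOFormatWriter.java:363) "
    "at org.bbop.termgenie.ontology.obo.OboWriterTools.writeFrame(OboWriterTools.java:52) "
    "at org.eclipse.jetty.server.Server.handle(Server.java:499)"
)

# Matches "at <qualified.method>(<File>.java:<line>)"; frames without a
# source line, such as "(Native Method)", are deliberately skipped.
FRAME = re.compile(r"\bat\s+([\w.$]+)\((\w+\.java):(\d+)\)")


def parse_frames(trace):
    """Return (method, file, line) tuples in the order they appear in the trace."""
    return [(m.group(1), m.group(2), int(m.group(3))) for m in FRAME.finditer(trace)]


frames = parse_frames(flat_trace)
top = frames[0]  # where the exception was thrown
termgenie_frames = [f for f in frames if f[0].startswith("org.bbop.termgenie.")]

print("thrown in:", top[0], "line", top[2])
print("first TermGenie caller:", termgenie_frames[0][0], "line", termgenie_frames[0][2])
```

In this case the exception is thrown inside the OBO writer while TermGenie is writing the new term frame, which is consistent with a server-side problem rather than incorrect input.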

bmeldal commented 7 years ago

Input looks fine. I think it's a server error. One of the developers will look at it, they are now tagged on the ticket so should see it :)

Birgit

bmeldal commented 7 years ago

@mcourtot emailed me; she's looking at it, but the main developers are based on the West Coast, so it might take a few hours before they can check it.

mcourtot commented 7 years ago

Ticket is at https://github.com/geneontology/termgenie/issues/102

rjdodson commented 7 years ago

OK, thanks Birgit and Melanie, I'll check back on this later today then.

rjdodson commented 7 years ago

Hi, just checking on this today. I added all terms as before and this seems to have worked.

The following terms have been created:

ID: GO:1905742 Label: Ras guanyl-nucleotide exchange factor complex

So at this point I just wait until the term becomes available. Thanks for all of your help! Bob

mcourtot commented 7 years ago

Fantastic, thanks for the update @rjdodson!