geneontology / noctua-form-legacy

Simple annoton editor workbench for Noctua.
BSD 3-Clause "New" or "Revised" License

Root MF Node evidence #45

Closed krchristie closed 6 years ago

krchristie commented 6 years ago

The SAE should not allow me to save an annoton that doesn't have all of the appropriate evidence. It let me save this:

[Screenshot: 20180328-newannotonsae]

I had entered the two BP rows first, and the save button was not active. Once I entered the "molecular_function" term into the MF section, the Save button became active even though I hadn't entered any reference info. Thus these BP annotations don't actually show up in the Annotation Preview due to lack of the evidence on the MF annotation.

model: http://68.181.125.145:8910/editor/graph/gomodel:5ab581e800000496

tmushayahama commented 6 years ago

@krchristie If the term is a root node (molecular_function for MF, biological_process for BP, or cellular_component for CC), the SAE currently relaxes the rule and does not require evidence. In this case 'molecular_function (GO:0003674)' is a root node. @pgaudet @vanaukenk @thomaspd. @ukemi I remember discussing not adding the evidence GO_REF:0000015 ("no biological data found used in manual assertion") when the user provides no evidence. Should the SAE require evidence when a root node is provided?

krchristie commented 6 years ago

I do remember discussing that the SAE should allow putting in the root MF without filling in the evidence field, and that would be great. Actually, I thought that it was going to assume the same experimental evidence as for the other rows in the annoton (which, thinking it through, I realize might only work if all rows in the annoton had the same evidence, which is often but not always true). In previous discussions, @cmungall has argued in favor of having the root MF annotations in models like these (I'll have to leave it to him to restate that argument if needed). I thought the plan was that we would be able to distinguish two kinds of root MF annotations. Those made in the model only to hang other BP and/or CC annotations off of would get the same experimental evidence as their attached annotations. Those where the curator deliberately wanted to input a root MF annotation would indicate that they have comprehensively examined the literature and want to say that we do NOT know the function of this gene as of the annotation date. We absolutely can NOT assume ND evidence unless the curator enters it specifically.

If the SAE is going to allow an MF with no evidence, then it still needs to generate the BP and/or CC annotations, not just in the graph editor, but also in a way that they will show up in the Annotation Preview and will be exported into the GPAD. Right now in this model, the annotations for Ift88 that I entered in the SAE using the Default form with a filled-in MF but no evidence/reference info show up in the graph, but do NOT show up in the Annotation Preview or the GPAD export.

I entered two annotons via the SAE in this model. I used the Default form for the Ift88 gene, and the Ift88 BP annotations don't show up in the Annotation Preview or GPAD because there is no evidence on the root MF term. I entered the Ift20 annotations using the CC only form, and they show up in the Annotation Preview and GPAD, but not in the Table View in the SAE.

I would be all for not having to enter the MF root evidence info if I only want to make BP and/or CC annotations. Perhaps the suggestion in this ticket to make just a tick box in the MF section of the form to indicate "MF not known", would be more intuitive to curators: https://github.com/geneontology/simple-annoton-editor/issues/46

vanaukenk commented 6 years ago

A couple of thoughts on this:

1) In principle, I think we should always have evidence on an annotation statement, so we should not create MF root node annotations without evidence anywhere in Noctua.

2) For the cases where we need to create an MF root node annotation for completion (i.e. not the conventional ND type of MF), it makes most sense to me to use the same reference and evidence code as the associated BP annotation. When curation groups then consume the derived GPAD file, they can filter out any root node MF annotations that do NOT use ND and the associated ND reference, GO_REF:0000015.
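The filtering described in 2) could be sketched roughly as below. This is only an illustration, not the MGI parsing script or any Minerva code: the dictionary keys (`gene`, `term`, `evidence`, `reference`), the helper names, and the example rows are all hypothetical, and a real GPAD file would carry ECO identifiers rather than the short code `ND`.

```python
# Hypothetical sketch: drop "place-holder" root MF annotations from a
# list of annotation rows. Field names and rows are assumptions made
# for illustration; they are not the real GPAD column layout.

ROOT_MF = "GO:0003674"           # molecular_function, as cited in this thread
ND_REFERENCE = "GO_REF:0000015"  # the ND reference named in this thread


def is_placeholder_root_mf(annotation):
    """True for a root MF annotation that does NOT use ND + GO_REF:0000015."""
    if annotation["term"] != ROOT_MF:
        return False
    uses_nd = (
        annotation["evidence"] == "ND"
        and annotation["reference"] == ND_REFERENCE
    )
    return not uses_nd


def filter_placeholders(annotations):
    """Keep everything except place-holder root MF annotations."""
    return [a for a in annotations if not is_placeholder_root_mf(a)]


rows = [
    # Root MF that borrowed experimental evidence from a BP row: filtered out.
    {"gene": "geneA", "term": ROOT_MF, "evidence": "IMP", "reference": "REF:example"},
    # Conventional ND annotation to the root MF: kept.
    {"gene": "geneB", "term": ROOT_MF, "evidence": "ND", "reference": ND_REFERENCE},
    # Ordinary BP annotation (placeholder term id): kept.
    {"gene": "geneA", "term": "GO:BP_PLACEHOLDER", "evidence": "IMP", "reference": "REF:example"},
]

kept = filter_placeholders(rows)
```

With these example rows, only the root MF row carrying borrowed experimental evidence is dropped; the conventional ND row and the BP row pass through.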

@ukemi - is 2) consistent with the MGI parsing script for consuming the derived GPAD files?

For further discussion of what appears in the Table View of the SAE, let's continue on #43

ukemi commented 6 years ago

I agree. It does require some reeducation of curators in that if they only want to make an annotation to BP, they have to realize that this means the protein has some MF. The evidence on the anonymous MF should be the same as on the BP. These anonymous annotations should not be output to the GPAD. An annotation to only the root MF should follow our conventional requirements, where the evidence is ND and the reference is GO_REF:0000015. We are making the assumption that 'place-holder' annotations that differ from our conventional MF root annotations with ND will be filtered out of the file before we pick it up. I think @balhoff already has that in place.

vanaukenk commented 6 years ago

Okay, thanks @ukemi Yes, checking on other tickets, it looks like this is in place: https://github.com/geneontology/minerva/issues/154 https://github.com/geneontology/minerva/pull/130/files

balhoff commented 6 years ago

> We are making the assumption that 'place-holder' annotations that differ from our conventional MF root annotations with ND will be filtered out of the file before we pick it up. I think @balhoff already has that in place.

@ukemi yes that's true, however if a more specific MF for that node can be inferred then it will be output.

ukemi commented 6 years ago

If a more specific annotation can be inferred, it should not have an ND evidence code. Do we have examples of that? We need a check to make sure that doesn't happen.

vanaukenk commented 6 years ago

Yes, I'm trying to think of, in practice, when that would actually happen. Regardless, it seems that if that were to happen, it would need to be flagged in the tool so that the curator could check the annotation statements they made and make changes, if needed.

ukemi commented 6 years ago

The only time I can think of it happening is by mistake. For example, if some of the terms in the MF refactor are defined as molecular functions that relate to a process, then they could be inferred. But then the evidence would be the same as that for the process. We need to educate curators about this.

vanaukenk commented 6 years ago

Yes, I agree; I think this would likely happen by mistake and thus the curator would need to revisit the model and make modifications.

krchristie commented 6 years ago

What the DEFAULT template is doing now is that it let me put in the root MF term without evidence and didn't assume the experimental evidence that was on the BP term. I thought we had agreed that if you didn't fill in the MF term (as in didn't even fill in the MF term blank), you would be allowed to save with the evidence assumed to be the same as that for the BP term.

What the CC ONLY template is doing is also inconsistent with what I thought we had agreed, in that it doesn't put in an anonymous MF at all.

I don't know if I would want any reasoning to infer MF from BP. The one other time I have seen that, it resulted in ontology changes.

vanaukenk commented 6 years ago

From call on 2018-04-11: the form should now require evidence on all assertions in a model before saving, even if the annotation is to the root node.

Proposal for the default form: root node evidence will either be ND and a GO_REF if there is no data for a molecular_function or experimental data from a BP annotation. Only root node annotations with ND and a GO_REF will be propagated to the derived GPAD output.
Preferred curation SOP, though, will be to either add an existing MF from the database or make an ND annotation.
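As a rough sketch of the rule from the call and the proposal above: every row needs evidence before saving, and a root MF row gets either ND with the GO_REF or the evidence from a BP row. The function names and row fields below are hypothetical and are not taken from the SAE code base.

```python
# Hypothetical sketch of the proposed save rule. Row fields and function
# names are assumptions for illustration, not the SAE implementation.

ROOT_MF = "GO:0003674"           # molecular_function
ND_EVIDENCE = "ND"
ND_REFERENCE = "GO_REF:0000015"


def has_evidence(row):
    """A row counts as evidenced only if both fields are filled in."""
    return bool(row.get("evidence")) and bool(row.get("reference"))


def can_save(rows):
    """Allow saving only when every row, root node or not, has evidence."""
    return len(rows) > 0 and all(has_evidence(r) for r in rows)


def root_mf_row(bp_row=None):
    """Build a root MF row: ND + GO_REF if there is no data for an MF,
    otherwise the experimental evidence taken from the given BP row."""
    if bp_row is None:
        return {"term": ROOT_MF, "evidence": ND_EVIDENCE, "reference": ND_REFERENCE}
    return {
        "term": ROOT_MF,
        "evidence": bp_row["evidence"],
        "reference": bp_row["reference"],
    }


# Example BP row with placeholder identifiers.
bp = {"term": "GO:BP_PLACEHOLDER", "evidence": "IMP", "reference": "REF:example"}

# The situation reported in this issue: root MF entered with no evidence.
blocked = can_save([{"term": ROOT_MF, "evidence": "", "reference": ""}, bp])

# The proposed behaviour: root MF carries the BP row's evidence.
allowed = can_save([root_mf_row(bp), bp])
```

Under this sketch the annoton originally reported in this issue would not be savable, while the same annoton with the BP evidence copied onto the root MF would be.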

tmushayahama commented 6 years ago

@vanaukenk @krchristie this issue has been resolved. However, for easier tracking in the future, let's edit the title so it reflects "Root Node" evidence, as per the comment above: https://github.com/geneontology/simple-annoton-editor/issues/45#issuecomment-377034555