geneontology / go-annotation

This repository hosts the tracker for issues pertaining to GO annotations.
BSD 3-Clause "New" or "Revised" License

Automatic reports of ND-evidenced annotations #2045

Closed pgarmiri closed 8 months ago

pgarmiri commented 6 years ago

Opening this ticket as a platform for discussion, as suggested by @vanaukenk.

Following discussions within the GOA team, it has been decided that we will implement a procedure that will run periodically and automatically delete ND-evidenced annotations from internal (P2G-using) sources when annotations with evidence codes from the following set exist to GO terms in the same GO aspect:

    ECO:0000269 (EXP) and its descendants

    ECO:0006056 (HTP) and its descendants

    ECO:0000250 (ISS) and its descendants

    ECO:0000317 (IGC) and its descendants
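As a rough illustration of the rule proposed above, the following Python sketch finds ND annotations that share a gene product and GO aspect with an annotation whose evidence code is in the trigger set. The `Annotation` record, the `find_redundant_nd` function and the hard-coded `TRIGGER_CODES` set are assumptions made for illustration only. A real implementation would presumably expand the four ECO classes to their descendants from the ECO ontology and read annotations from Protein2GO rather than from a list.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    """Hypothetical minimal annotation record (not the Protein2GO schema)."""
    gene_product: str
    go_id: str
    aspect: str    # "F", "P" or "C"
    evidence: str  # GO evidence code, e.g. "ND" or "IDA"
    source: str    # assigning group


# Assumed, illustrative expansion of EXP, HTP, ISS and IGC to GO evidence codes.
TRIGGER_CODES = frozenset({
    "EXP", "IDA", "IPI", "IMP", "IGI", "IEP",
    "HTP", "HDA", "HMP", "HGI", "HEP",
    "ISS", "ISO", "ISA", "ISM",
    "IGC",
})


def find_redundant_nd(annotations, trigger_codes=TRIGGER_CODES):
    """Return ND annotations that co-occur with a trigger-code annotation
    for the same gene product in the same GO aspect."""
    annotations = list(annotations)
    # (gene product, aspect) pairs that already have qualifying evidence.
    supported = {
        (a.gene_product, a.aspect)
        for a in annotations
        if a.evidence in trigger_codes
    }
    return [
        a for a in annotations
        if a.evidence == "ND" and (a.gene_product, a.aspect) in supported
    ]
```

The function only reports candidates; whether they are then deleted, disputed or merely listed in a report is a separate policy decision.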

A lot of discussion has been generated around this on the goa_curators@ebi.ac.uk email list. Some points that were raised were how the ND evidence code is used differently by different groups, and the perception of protein interaction data. Some further points on the perception of the IEP evidence code and the ‘response to …’ terms are in the ticket from Val (https://github.com/geneontology/go-annotation/issues/2025).

pgaudet commented 6 years ago

Hi @pgarmiri

Is it possible to hold off the implementation until there has been more discussion? That would be much appreciated.

Thanks, Pascale

pgarmiri commented 6 years ago

Hi @pgaudet ,

Of course! We are not planning to implement this yet. We would like to reach a common agreement.

I opened this ticket as Kimberly suggested, so as to have all the discussion going on in one place.

We initiated this as certain groups are already doing something similar at their databases. We also kept finding cases that had ND annotations at the same time as EXP annotations from as far as 10 years ago!

Thanks, Penelope

ValWood commented 6 years ago

> We also kept finding cases that had ND annotations at the same time as EXP annotations from as far as 10 years ago!

The simple solution in this case is to train all curators to delete the ND annotation when they make an experimental annotation... then everything would work just fine...

pgarmiri commented 6 years ago

Hi @ValWood, yes, that could work (and in most cases it does work already) if the ND annotation is from the same group that adds the experimental annotations. Otherwise, if the ND was added by a different group, curators can't edit or delete it.

ValWood commented 6 years ago

Could this be relaxed in protein2GO? I can't see any valid objections from contributing groups that their ND annotation is deleted when another group curates experimental information? Or a dispute raised for the contributor to remove, in the short term?

pgaudet commented 6 years ago

Or people should at least get into the habit of raising a dispute in those cases.

RLovering commented 6 years ago

I would be happy with automatic deletion, as this would ensure a consistent approach to the application of ND. However, I am still not convinced that 'protein binding' as an annotation is sufficient to justify removal of the MF ND annotation, and would therefore appreciate it if a full discussion could be held to ensure an agreed approach.

Ruth

tonysawfordebi commented 6 years ago

@ValWood We can't automatically delete ND annotations from external (non-P2G-using) groups, because we do a complete refresh of annotations from all such groups every night; the ND annotations would have to be removed at source in order to ensure that they stay deleted.

Also it's no longer possible to dispute annotations from external groups within P2G; we removed that functionality some time ago as a result of a) several misunderstandings about how it worked, and b) a total lack of response to dispute emails from some of the external groups. Because of that we've restricted the dispute function to working only on annotations that live in the GOA database. (It is still possible, however, to send an email about external annotations to the registered contact address for the group from within P2G, but that is treated as a "fire & forget" operation, and no record is kept of any correspondence.)

ValWood commented 6 years ago

I'm not advocating automatic deletion, but that the constraint that only the group who made the annotation in protein2GO can remove it, should be lifted.

Ideally, any curator should be able to delete an ND annotation from anywhere else. You have to admit that this makes sense ;)

pgaudet commented 6 years ago

@tonysawfordebi People should use GH for the disputes they cannot do in P2GO.

tonysawfordebi commented 6 years ago

@ValWood What I'm saying is that if the annotation comes from an external group, then even if we relaxed the restriction in P2G and allowed anybody to delete it, that would actually serve no purpose because it hasn't been deleted in the source database, and would therefore reappear the next time the annotations from that group were imported. Unless we massively re-engineered the import pipeline, which would be carrying things too far.

@pgaudet Yep, exactly. (Although that does require that every group needs to have someone monitoring GH issues.)
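The point about the nightly refresh can be made concrete with a toy model. The sketch below is not the GOA import pipeline; `nightly_refresh` and the tuple layout are hypothetical. It assumes only what is stated above: annotations from external groups are replaced wholesale each night by whatever the source currently provides.

```python
def nightly_refresh(local, source_files):
    """Toy model: drop every locally held annotation from an external group
    and reload that group's annotations from its source file."""
    kept = [a for a in local if a[0] not in source_files]
    for group, annotations in source_files.items():
        kept.extend((group, *a) for a in annotations)
    return kept


# Tuples are (group, gene_product, go_id, evidence); all values are made up.
source_files = {"ExternalDB": [("GP1", "GO:0003674", "ND")]}
local = nightly_refresh([], source_files)

# A curator deletes the ND annotation locally...
local = [a for a in local if a[3] != "ND"]

# ...but the next import restores it, because the source still contains it.
local = nightly_refresh(local, source_files)
```

In this model the only durable fix is to change `source_files`, which corresponds to removing the annotation at source.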

ValWood commented 6 years ago

Ah OK, but the majority of databases are using Protein2GO, OR the annotations are no longer maintained (JCVI TIGR). At the moment the QC group has jurisdiction over defunct annotation, but if others agree I see no reason why this could not extend to everyone for the removal of ND.

For all annotating groups, but especially for groups who are active and not using Protein2GO (AspGD, CGD, PomBase?), there should be a way to inform them (via GO rules, @cmungall?) that an annotation has appeared for a gene product with an "ND". Most extant databases hopefully have internal checks for this anyway...

It seems that this check needs to happen upstream of GOA stripping the annotations out because as you say, unless the information is fed back a) the ND is going to reappear and GOA will continue to strip it out and b) this will lead to (even more) inconsistencies between GOA and the GO database.

tonysawfordebi commented 6 years ago

Yes; these things need to be removed at source.

ValWood commented 6 years ago

Hi Suzi,

I think initially an alert would be good for this based on all of the discussion.

I can see a situation where PomBase makes an ND but a mapping appears which might be incorrect, or very high level. PomBase do not use high-level terms that are uninformative about the process; mapping pipelines might. If we have the opportunity to remove the annotation manually based on the GO logs, we would pick up and report any errors in the mapping pipelines at this point.

v

selewis commented 6 years ago

Agreed, keep notification at the alert level (at least for now, we can always revisit). Put this in the pipeline as a QC rule - @dougli1sqrd I can talk to you about it to explain exactly what is needed. I'll add a ticket on your tracker.

kltm commented 6 years ago

@selewis Could you remove our email addresses and untag random people from this ticket? Maybe delete the whole comment to get it out of the edit history. https://github.com/geneontology/go-annotation/issues/2045#issuecomment-411479767

selewis commented 6 years ago

done, hadn't realized gmail would do that, quite annoying.

pgaudet commented 6 years ago

This is the actual ticket where that rule will be implemented: https://github.com/geneontology/go-site/issues/775 We need to decide whether this will be a hard check (filter) or just a warning (report).

Thanks, Pascale

vanaukenk commented 6 years ago

After discussion at Montreal meeting:

1) Automatic deletion of ND-evidenced annotations when an annotation with an EXP (or other) evidence code exists will not be allowed.

2) At the moment, for MF, groups may continue to have an annotation to root molecular_function as well as an annotation to GO:0005515 (or child), but we will flag those annotations (or continue to flag) for review.

3) Get the list of genes/gene products where GO:0005515 (or child) is the ONLY MF annotation to see if these functions can be described better with an existing or new term.

4) For genes/gene products where a better MF term is found, consider prioritizing those families for PAINT annotation so that we can leverage the new information across all organisms.
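The list described in point 3 could be produced along the following lines. This is a minimal sketch under stated assumptions: the function name and the input layout are hypothetical, the descendants of GO:0005515 are supplied by the caller rather than computed from the ontology, and annotations to the root term (GO:0003674, molecular_function) are ignored when deciding whether protein binding is the "only" MF annotation.

```python
from collections import defaultdict

MF_ROOT = "GO:0003674"          # molecular_function
PROTEIN_BINDING = "GO:0005515"  # protein binding


def binding_only_gene_products(mf_annotations, binding_descendants=()):
    """Return gene products whose MF annotations, ignoring the root term,
    are all to protein binding or one of its descendants.

    mf_annotations:      iterable of (gene_product, go_id), MF aspect only.
    binding_descendants: descendants of GO:0005515, supplied by the caller.
    """
    allowed = set(binding_descendants) | {PROTEIN_BINDING}
    terms_by_gp = defaultdict(set)
    for gene_product, go_id in mf_annotations:
        if go_id != MF_ROOT:  # assumption: root annotations are not informative
            terms_by_gp[gene_product].add(go_id)
    # Keep gene products whose remaining MF terms are all binding terms.
    return sorted(gp for gp, terms in terms_by_gp.items() if terms <= allowed)
```

Gene products annotated only to the root term never enter the result, so the output contains only cases where some protein binding annotation exists and nothing more specific does.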

pgarmiri commented 6 years ago

Hi,

Sorry I missed this discussion. I have a couple of questions..

  1. Does the above decision (1) affect all aspects, or only MF, where there is the controversy about annotations to GO:0005515 (or child)?

  2. If it is for all, will the description of ND evidence code and its usage be changed in the wiki page?

http://wiki.geneontology.org/index.php/No_biological_Data_available_(ND)_evidence_code

In particular, the second point of the overview could read as a bit confusing or misleading, both to curators and to users, when they see an ND annotation and an EXP annotation for the same aspect. This point also makes it unclear when an ND annotation should be added.

"Use of the ND evidence code specifically indicates that a curator has looked but not been able to find information that supports making an annotation to any term from the Molecular Function, Biological Process, or Cellular Component as of the annotation date indicated. "

Thanks, Penelope

vanaukenk commented 5 years ago

@pgarmiri - we will be discussing this issue on today's call. Can you attend? Thx.

pgarmiri commented 5 years ago

@vanaukenk , yes, great, I will be there. Penelope

suzialeksander commented 8 months ago

Since deletion wasn't agreed on, I changed the title. This should be solved by https://github.com/geneontology/go-site/blob/master/metadata/rules/gorule-0000054.md, but that rule isn't in the reports. @pgaudet can you verify that if there are violations of GORULE*54, they would appear in the report? If so we're OK to close

pgaudet commented 8 months ago

Moving this as a GO-rule.

Note that this is a 'global check' (i.e., not a line-by-line check; it requires looking at other annotations), so we will not implement this in the very near future.

pgaudet commented 8 months ago

The ticket exists here: https://github.com/geneontology/go-site/issues/914