Open pgaudet opened 5 years ago
I want to see what the violations look like but I can't find the organism-specific errors (again)
My bookmarks are to here http://release.geneontology.org/ and here
from which I can find the intersection rules https://github.com/geneontology/go-site/blob/master/metadata/rules/README.md#gorule0000009
and I can see this report, and the failures
But this isn't what the curators will see, is it? They will get a specific link for their species, won't they? I can't find this link anywhere (or where to go for the organism-specific lists).
This is my draft text, but I'd like to check that this makes sense in the context of the report.
~The “Matrix project” uses a set of QC rules generated using co-annotation and biological knowledge. Rules are created if two GO terms are usually never observed to annotate the same gene product simultaneously, after assessing the presence or absence of annotations across a set of evolutionarily diverse species (pombe, cerevisiae, worm, mouse). Violating gene products violating these rules are reported. The curator should look at the gene product’s annotations to both terms, and assess which annotation is in error OR add a “rule challenge” to the Annotation tracker to refine the rule accordingly https://github.com/geneontology/go-annotation For more background information on rule building see https://www.slideshare.net/ValerieWood/copy-of-biocuration-2017~
See revisions below from @mah11
Are we all referring to the same thing here? Note the resources: https://github.com/geneontology/shared-annotation-check/ and, for example: http://release.geneontology.org/2018-12-01/reports/shared-annotation-check.html
I found this page, http://release.geneontology.org/2018-12-01/reports/shared-annotation-check.html but it isn't organism specific.... What do people see in their organism taxon checks? That is what I can't find...
Organism-specific taxon checks are still in development with @dougli1sqrd and @balhoff . I believe that we do have something though, provided by the old owltools. @dougli1sqrd , is that correct, or have those been shuffled off?
@kltm - Val is asking whether there are versions of the shared-annotation-check report split out into one page/report per species or contributor, as the gorule checks, predictions, etc. are ("taxon checks" in https://github.com/geneontology/go-site/issues/942#issuecomment-447413725 was a mistake). If not (and it isn't on the to-do list already), one of us should open a ticket requesting this, because it will be a lot more convenient for annotators.
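To make the request concrete, here is a minimal sketch of what splitting one combined report into per-resource lists could look like. The record layout and the field names (`gene_product`, `assigned_by`, `terms`) are assumptions made for illustration; they are not taken from the shared-annotation-check code or its report format.

```python
from collections import defaultdict

def split_by_resource(violations):
    """Group violation records by their (hypothetical) 'assigned_by' field."""
    grouped = defaultdict(list)
    for record in violations:
        grouped[record["assigned_by"]].append(record)
    return dict(grouped)

# Made-up records standing in for rows of the combined report.
violations = [
    {"gene_product": "geneA", "assigned_by": "ResourceX", "terms": ("TERM_A", "TERM_B")},
    {"gene_product": "geneB", "assigned_by": "ResourceY", "terms": ("TERM_A", "TERM_B")},
    {"gene_product": "geneC", "assigned_by": "ResourceX", "terms": ("TERM_C", "TERM_D")},
]

# One section (or page) per resource instead of a single combined list.
for resource, rows in sorted(split_by_resource(violations).items()):
    print(f"{resource}: {len(rows)} violation(s)")
```

Grouping by species instead would only need a different key, which is why a species/resource list to divide by is the minimum requirement.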
@pgaudet - I've edited the text Val suggested:
The "Matrix" produces annotation QC reports using a set of rules based on observed patterns of biological process term co-annotation, combined with additional biological knowledge. Rules are created if two GO terms are rarely or never used to annotate the same gene product simultaneously, and after assessing the presence or absence of annotations across a set of evolutionarily diverse species (fission yeast, budding yeast, worm, mouse).
Annotations violating these rules are reported. [add link(s) to report location(s) here] For each reported gene product, the curator should look at both annotated terms, and assess which annotation is in error. If both are correct, open a ticket on the Annotation tracker to refine the rule accordingly (choose labels "Matrix" and "annotation rule").
For more background information on rule building see https://www.slideshare.net/ValerieWood/copy-of-biocuration-2017.
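As a rough illustration of the check described in the text above, here is a minimal Python sketch. The rule pairs, term names, and gene product IDs are made-up placeholders, and the function is a hypothetical helper; this is not the actual Matrix rule set or the shared-annotation-check implementation.

```python
from collections import defaultdict

# Hypothetical rules: each is a pair of terms that should not both
# annotate the same gene product. Names are placeholders, not real GO IDs.
RULES = {frozenset({"TERM_A", "TERM_B"})}

def find_violations(annotations, rules):
    """Return sorted (gene_product, term_1, term_2) tuples for every rule
    whose two terms are both annotated to the same gene product."""
    terms_by_gene_product = defaultdict(set)
    for gene_product, term in annotations:
        terms_by_gene_product[gene_product].add(term)

    violations = []
    for gene_product, terms in terms_by_gene_product.items():
        for rule in rules:
            if rule <= terms:  # both terms of the pair are present
                term_1, term_2 = sorted(rule)
                violations.append((gene_product, term_1, term_2))
    return sorted(violations)

# Made-up annotations: geneA carries both terms, geneB only one.
annotations = [
    ("geneA", "TERM_A"),
    ("geneA", "TERM_B"),
    ("geneB", "TERM_A"),
]
print(find_violations(annotations, RULES))
```

Each reported tuple corresponds to one case a curator would review: either one of the two annotations is wrong, or both are correct and the rule itself needs refining.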
@mah11 I talked to Val about this a little bit, but we never made a ticket. https://github.com/geneontology/shared-annotation-check/issues Minimally, we would need a species/resource list to divide by. We would probably end up putting it under "pipeline" as a project.
@kltm RE Owltools having taxon checks, yes owltools still runs and still reports taxon checks. For example http://current.geneontology.org/reports/aspgd-report.html#otc shows rule violations for GO_AR:0000013 which is the owltools taxon checks. (This example isn't showing taxon violations precisely, but when checking this rule owltools couldn't find the taxon class, so it's erroring here)
Yes Midori is correct, I want to see these in species-checks.
Also, despite being told a number of times, and bookmarking the correct place, for some reason I can't find the place to look for the species checks. I therefore think others might find this quite challenging. I think this is what Seth is referring to. If people can't find these files, the annotations aren't going to get fixed, so this should be high priority (among all the other high priorities).
Everyone/each resource should also get a periodic reminder link to fix broken rules. If this happens, I haven't seen it...
@kltm
I talked to Val about this a little bit, but we never made a ticket. ... Minimally, we would need a species/resource list to divide by.
OK, I've opened https://github.com/geneontology/shared-annotation-check/issues/2
I will try and summarize. I changed the ticket title and I suggest we use "Shared Annotation Matrix" to avoid confusion with any other matrices.
I would like to move things forward and include shared annotation checks as part of the standard go-rules checks. However, we will have to prioritize this, either on a managers call or at the meeting.
But for now, the scope of this ticket is to add documentation. Val's draft text is good. So the action is for @kltm to either embed or link to this text. Depending on other things we may or may not make this before the meeting.
Use @mah11 revised text below mine.
Matrix: Add more explanation text to the matrix rule check data html page so curators know what they’re actually looking at on this page.
@ValWood Can you provide some text?