geneontology / pipeline

Declarative pipeline for the Gene Ontology.
https://build.geneontology.org/job/geneontology/job/pipeline/
BSD 3-Clause "New" or "Revised" License

Main pipelines failing on reductions from the addition of gorule 64 #367

Closed by kltm 3 months ago

kltm commented 3 months ago

Pipeline "sanity" failure on severe reductions in: goa_pig_isoform goa_chicken_isoform

70976 removed in chicken

For example: ERROR - Violates GO Rule: GORULE:0000064: TreeGrafter ('GO_REF:0000118') IEAs should be filtered for GO reference species -- UniProtKB A0A023PS12 BG6 enables GO:0005102 GO_REF:0000118 IEA PANTHER:PTHR24100:SF140 F Ig-like domain-containing protein BG6|V-BG protein taxon:9031 20240201 TreeGrafter UniProtKB:A0A023PS12

Need to clarify with @pgaudet: since GORULE:0000064 is intentional, is the correct course of action to relax the sanity rules for these files?
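To see how much of a drop comes from a single rule, one option is to tally report lines by rule ID and taxon. The following is only a rough sketch: it assumes violation lines look like the example quoted in this comment, and `tally_violations` is a hypothetical helper, not part of the pipeline.

```python
import re
from collections import Counter

# Patterns inferred from the single example line in this issue (an assumption).
RULE_RE = re.compile(r"GORULE:\d{7}")
TAXON_RE = re.compile(r"taxon:(\d+)")

def tally_violations(lines):
    """Count violation lines per (rule ID, taxon ID) pair."""
    counts = Counter()
    for line in lines:
        rule = RULE_RE.search(line)
        taxon = TAXON_RE.search(line)
        if rule and taxon:
            counts[(rule.group(0), taxon.group(1))] += 1
    return counts

# The example line from this issue, copied as-is.
sample = (
    "ERROR - Violates GO Rule: GORULE:0000064: TreeGrafter ('GO_REF:0000118') "
    "IEAs should be filtered for GO reference species -- UniProtKB A0A023PS12 BG6 "
    "enables GO:0005102 GO_REF:0000118 IEA PANTHER:PTHR24100:SF140 F "
    "Ig-like domain-containing protein BG6|V-BG protein taxon:9031 20240201 "
    "TreeGrafter UniProtKB:A0A023PS12"
)

for (rule, taxon), n in tally_violations([sample]).items():
    print(rule, f"taxon:{taxon}", n)
```

Run over a full report, this would show whether GORULE:0000064 accounts for most of the removed chicken (taxon:9031) lines.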

pgaudet commented 3 months ago

@kltm What is the comparison? It should be with the last release, so I am not sure where this drop is coming from.

If the drop is completely (or mostly) caused by TreeGrafter, please do not relax the check; we don't want to include these annotations right now.

The current release has 131441 chicken annotations; 65k IEAs, 55k IBAs. How many total IEAs would you have without the TreeGrafter filter?

Thanks, Pascale

kltm commented 3 months ago

@pgaudet The reports pretty much give the info up to a point (http://skyhook.berkeleybop.org/snapshot/reports/gorule-report.html), but the specific numbers for these two files in particular are:

goa_pig_isoform 'total': 134727, 'skipped': 68429
goa_chicken_isoform 'total': 126307, 'skipped': 70976

That is over 50% skipped in both cases.
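As a quick check of those proportions, here is a minimal sketch using the counts quoted above. The 0.5 limit is an illustrative assumption; the pipeline's actual sanity threshold is not stated in this thread.

```python
# Counts quoted from the report for the two affected files.
reports = {
    "goa_pig_isoform": {"total": 134727, "skipped": 68429},
    "goa_chicken_isoform": {"total": 126307, "skipped": 70976},
}

# Hypothetical limit, for illustration only (not the pipeline's real setting).
MAX_SKIPPED_FRACTION = 0.5

def skipped_fraction(report):
    """Return the share of annotations that were skipped."""
    return report["skipped"] / report["total"]

for name, report in reports.items():
    frac = skipped_fraction(report)
    status = "exceeds" if frac > MAX_SKIPPED_FRACTION else "within"
    print(f"{name}: {frac:.1%} skipped ({status} the assumed limit)")
```

This gives roughly 50.8% for pig and 56.2% for chicken.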

pgaudet commented 3 months ago

According to the stats currently on snapshot (copied here: https://docs.google.com/spreadsheets/d/1CSeFbKPP33vc0khz8GadqfLf5QqB-KcYc6WgBHKeDjQ/edit#gid=0), compared to the previous release we gain 1364 annotations for Gallus gallus, for a total of 26668 annotations. Similarly for pig: we gain 629, for a total of 27045 annotations.

So, this is all good.

Thanks, Pascale

kltm commented 3 months ago

From discussion w/ @pgaudet , I'll relax the rules for these two files to proceed.

kltm commented 3 months ago

Moving forward with trying this on snapshot.

kltm commented 3 months ago

The second stage of snapshot has progressed beyond sanity, so we're good here.