Closed by kltm 7 months ago
@kltm What is the comparison? It should be with the last release, so I am not sure where this drop is coming from.
If the drop is completely (or mostly) caused by TreeGrafter, please do not relax the check; we don't want to include these annotations right now.
The current release has 131441 chicken annotations; 65k IEAs, 55k IBAs. How many total IEAs would you have without the TreeGrafter filter?
Thanks, Pascale
@pgaudet The reports pretty much give the info up to a point (http://skyhook.berkeleybop.org/snapshot/reports/gorule-report.html), but the specific numbers for these two files in particular are:
goa_pig_isoform 'total': 134727, 'skipped': 68429
goa_chicken_isoform 'total': 126307, 'skipped': 70976
over 50% in both cases.
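As a quick check of the "over 50%" figure, the skipped fraction can be recomputed from the two counts quoted above. This is only an illustrative sketch: the dictionary layout and the `skipped_fraction` helper are made up for this example and are not part of the pipeline's reports.

```python
# Totals and skipped counts exactly as quoted in the comment above.
reports = {
    "goa_pig_isoform": {"total": 134727, "skipped": 68429},
    "goa_chicken_isoform": {"total": 126307, "skipped": 70976},
}


def skipped_fraction(report):
    """Fraction of annotation lines that were skipped (hypothetical helper)."""
    return report["skipped"] / report["total"]


fractions = {name: skipped_fraction(r) for name, r in reports.items()}

for name, frac in fractions.items():
    print(f"{name}: {frac:.1%} skipped")
# goa_pig_isoform: 50.8% skipped
# goa_chicken_isoform: 56.2% skipped
```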
According to the stats currently on snapshot (copied here: https://docs.google.com/spreadsheets/d/1CSeFbKPP33vc0khz8GadqfLf5QqB-KcYc6WgBHKeDjQ/edit#gid=0), compared to the previous release we gain 1364 annotations for Gallus gallus, for a total of 26668 annotations. Similarly for pig: we gain 629, for a total of 27045 annotations.
So, this is all good.
Thanks, Pascale
From discussion with @pgaudet, I'll relax the rules for these two files to proceed.
Moving forward with trying this on snapshot.
The second stage of snapshot has progressed beyond sanity, so we're good here.
Pipeline "sanity" failure on severe reductions in: goa_pig_isoform goa_chicken_isoform
70976 removed in chicken, with errors like:
ERROR - Violates GO Rule: GORULE:0000064: TreeGrafter ('GO_REF:0000118') IEAs should be filtered for GO reference species -- UniProtKB A0A023PS12 BG6 enables GO:0005102 GO_REF:0000118 IEA PANTHER:PTHR24100:SF140 F Ig-like domain-containing protein BG6|V-BG protein taxon:9031 20240201 TreeGrafter UniProtKB:A0A023PS12
Need to clarify with @pgaudet: as GORULE:0000064 is intentional, is the correct course of action to relax the sanity rules for these files?
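For readers unfamiliar with the rule, the condition reported in the error line above can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual implementation: the function name, the constants, and the one-element taxon set are assumptions drawn only from the single quoted line, and the tab boundaries in `example` are my reading of that flattened log line against the standard GAF column order.

```python
# Hypothetical constants: taken from the quoted error line, not from the
# actual rule definition. The real set of filtered species is not given here.
TREEGRAFTER_REF = "GO_REF:0000118"
FILTERED_TAXA = {"taxon:9031"}  # chicken, per the quoted line (assumed set)


def violates_gorule_64(gaf_line: str) -> bool:
    """Return True if a GAF line looks like a TreeGrafter IEA for a filtered taxon."""
    cols = gaf_line.rstrip("\n").split("\t")
    if len(cols) < 15:  # GAF lines have at least 15 tab-separated columns
        return False
    references = cols[5].split("|")  # column 6: DB:Reference
    evidence = cols[6]               # column 7: evidence code
    taxa = cols[12].split("|")       # column 13: taxon
    return (
        evidence == "IEA"
        and TREEGRAFTER_REF in references
        and any(t in FILTERED_TAXA for t in taxa)
    )


# The annotation from the error message, re-joined with tabs (assumed layout).
example = "\t".join([
    "UniProtKB", "A0A023PS12", "BG6", "enables", "GO:0005102",
    "GO_REF:0000118", "IEA", "PANTHER:PTHR24100:SF140", "F",
    "Ig-like domain-containing protein", "BG6|V-BG", "protein",
    "taxon:9031", "20240201", "TreeGrafter", "", "UniProtKB:A0A023PS12",
])

print(violates_gorule_64(example))  # True
```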