ValWood closed 7 months ago
give IBA precedence over IEA for japonicus.
So different to pombe?
Here's the priority list we use for PomBase: https://github.com/pombase/pombase-chado/issues/695#issuecomment-709565296
I dropped the ball on that too. We can switch the order for pombe. I made a ticket https://github.com/pombase/pombase-chado/issues/1112
For PomBase we load PANTHER annotations from: http://snapshot.geneontology.org/annotations/pombase.gaf.gz (I think) but the japonicus GAF (http://snapshot.geneontology.org/annotations/japonicusdb.gaf.gz) file doesn't seem to contain any annotations that look like they come from PANTHER.
Am I looking in the right place?
Hmm, I don't know. I know they exist because I can see them in GOA. @pgaudet do you know why the japonicus IBAs would not be here, but are in GOA?
Not sure why they are not being exported.
@kltm do you know?
Hmm, I don't know. I know they exist because I can see them in GOA.
I checked the most recent GOA GAF file and there are 10878 japonicus annotations that look like they are from PANTHER. Should we load those?
Where? I don't see IBAs here: http://snapshot.geneontology.org/annotations/japonicusdb.gaf.gz. Maybe you have a different file?
Ah, great! I am not involved with the GOA pipeline, so I didn't think to check there.
If this meets your immediate need, great! I'll still check why GO Central doesn't have these annotations.
@kimrutherford are the pombe ones in the equivalent snapshot file?
If they disappear we might not notice because we would (presumably) get the PAINT annotations from the GOA load as a fall back?
are the pombe ones in the equivalent snapshot file?
Yep!
We load them from the snapshot rather than the GOA GAF so they are more up to date. (I think that's the reason)
If they disappear we might not notice because we would (presumably) get the PAINT annotations from the GOA load as a fall back?
Currently we don't load the PAINT annotations from the GOA file, so if they disappear from the snapshot file we wouldn't have any PAINT annotations.
OK, then we would notice! Will wait for @kltm on why the japonicus ones are missing.
We are talking about PAINT-generated IBAs, correct?
I'd note that they were never added as a PAINT species during any update. See the current sets in http://data.pantherdb.org/ftp/downloads/paint/17.0/2023-06-05/presubmission/ and the lack of an entry in the go-site metadata paint.yaml. That said, they do occur in files loaded into AmiGO, namely paint_other.gaf; we have:
sjcarbon@moiraine:/tmp$:) zgrep "taxon:4897" paint_other.gaf.gz | wc -l
0
sjcarbon@moiraine:/tmp$:) zgrep "taxon:402676" paint_other.gaf.gz | wc -l
11227
Looking around in AmiGO, we have: https://amigo.geneontology.org/amigo/search/annotation?q=SJAG_02056&sfq=document_category:%22annotation%22 and more generally: https://amigo.geneontology.org/amigo/search/annotation?q=*:*&fq=taxon_subset_closure_label:%22Schizosaccharomyces%20japonicus%20yFS275%22&
There seems to be no taxon overlap with what we are getting from JaponicusDB directly from their GAF: https://amigo.geneontology.org/amigo/search/annotation?q=*:*&fq=taxon_subset_closure_label:%22Schizosaccharomyces%20japonicus%22&sfq=document_category:%22annotation%22
Essentially, PAINT here is taxon:402676 and the JaponicusDB GAF is taxon:4897. Perhaps this is the source of confusion?
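To make the taxon split easier to check than with a plain substring grep, here is a minimal sketch that tallies the taxon column of a GAF file. It assumes standard GAF 2.x layout (tab-separated, taxon in column 13, `!` comment lines); the function names `count_taxa` and `count_taxa_in_file` are made up for illustration and are not part of any GO or PomBase tooling.

```python
import gzip
from collections import Counter

TAXON_COL = 12  # GAF column 13 (0-based index 12) holds the taxon


def count_taxa(lines):
    """Count annotation rows per taxon ID in an iterable of GAF lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue  # skip header/comment lines and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) <= TAXON_COL:
            continue  # skip rows too short to have a taxon column
        # The column may hold "taxon:A|taxon:B"; the first ID is the
        # organism the annotated gene product belongs to.
        counts[cols[TAXON_COL].split("|")[0]] += 1
    return counts


def count_taxa_in_file(path):
    """Same tally, reading a gzip-compressed GAF from disk."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return count_taxa(handle)
```

Something like `count_taxa_in_file("paint_other.gaf.gz")["taxon:402676"]` should then agree with the zgrep count above, as long as the string only ever appears in the taxon column.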
Tagging on @dustine32 just in case
The species taxon ID is 4897 https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=4897
402676 is a strain ID https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=402676
For a while NCBI added strain IDs. They no longer do this. I think the Panther/PAINT annotation should probably migrate to the species level (We annotate GO at the species level, not the strain).
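As a rough illustration of what "migrating to the species level" could look like on the consumer side, the sketch below rewrites only the taxon column of a GAF row from the strain ID to the species ID. This is an assumption about one possible workaround, not how PANTHER or PomBase does it; `remap_taxon` and `STRAIN_TO_SPECIES` are hypothetical names, and the sketch does nothing about gene or sequence identifiers.

```python
# Hypothetical mapping: S. japonicus strain yFS275 -> species-level taxon
STRAIN_TO_SPECIES = {"taxon:402676": "taxon:4897"}

TAXON_COL = 12  # GAF column 13 (0-based index 12)


def remap_taxon(line, mapping=STRAIN_TO_SPECIES):
    """Return a GAF line with strain taxon IDs replaced by species IDs.

    Comment lines and rows without a taxon column are returned unchanged.
    """
    if line.startswith("!"):
        return line
    ending = "\n" if line.endswith("\n") else ""
    cols = line.rstrip("\n").split("\t")
    if len(cols) <= TAXON_COL:
        return line
    # Handle "taxon:A|taxon:B" by remapping each ID independently
    taxa = cols[TAXON_COL].split("|")
    cols[TAXON_COL] = "|".join(mapping.get(t, t) for t in taxa)
    return "\t".join(cols) + ending
```

This only makes the taxon column consistent with the JaponicusDB GAF; whether the object IDs line up is a separate question.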
Who do we need to inform, @huaiyumi?
As @kltm pointed out, PAINT is releasing the IBAs for japonicus in paint_other.gaf rather than in its own paint_japonicus.gaf file. PomBase could extract these japonicus IBAs out of paint_other, or we may consider creating the separate paint_japonicus.gaf along with the appropriate go-site/metadata/datasets/paint.yaml entry containing a merges_into: japonicusdb property.
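For the first option (pulling the japonicus IBAs out of paint_other), a minimal sketch might look like the following. It assumes GAF 2.x columns (evidence code in column 7, taxon in column 13) and that the PAINT rows carry the strain taxon 402676 as noted above; `select_iba_rows` is a hypothetical helper, not an existing PomBase or go-site function.

```python
EVIDENCE_COL = 6  # GAF column 7: evidence code
TAXON_COL = 12    # GAF column 13: taxon


def select_iba_rows(lines, taxon="taxon:402676"):
    """Yield GAF rows that are IBA annotations for the given taxon."""
    for line in lines:
        if line.startswith("!"):
            continue  # skip header/comment lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) <= TAXON_COL:
            continue  # skip rows too short to have a taxon column
        is_iba = cols[EVIDENCE_COL] == "IBA"
        is_taxon = cols[TAXON_COL].split("|")[0] == taxon
        if is_iba and is_taxon:
            yield line
```

With a file opened via `gzip.open("paint_other.gaf.gz", "rt")`, the selected rows could be written straight to a smaller GAF for loading.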
The species taxon vs strain taxon conversation came up for pombe a while ago but I can't find the issue(s) or email thread with this discussion, unfortunately. PANTHER/PAINT uses strain ID 402676 for japonicus simply because that is the taxon that's tied to the Reference Proteome sequence data we use to build the trees. @huaiyumi Do you recall this discussion?
Is it possible to switch it to the species taxon? It would make more sense to be consistent (and I think for most orgs we use the species?)
I wonder if this has something to do with the Reference Proteome. We got the S. japonicus data from them. It is probably not a simple switch of taxon ID. The sequence IDs may be different under these two different taxa. I guess the best way to fix this is to work with the Reference Proteome.
GO annotation should be at the species level though, right @pgaudet?
@ValWood Did Maria answer the question?
This is not really a GO problem; it's up to Reference Proteomes and Japonicus to agree on the reference strain.
I'm guessing it will be resolved by PANTHER mapping the annotations over from the strain ID to the species ID. val
Japonicus is now included in PANTHER (this probably happened a while ago, I hadn't checked).
This means that S. japonicus is now a species which is "PAINTED" and supplies experimental annotations for transfer from the PAINT project.
There are now 10,878 PAINTED GO annotations for S. japonicus: https://www.ebi.ac.uk/QuickGO/annotations?taxonId=4897&taxonUsage=descendants&evidenceCode=ECO:0000318&evidenceCodeUsage=descendants
Most, if not all, of these should be covered by our IEA pipeline. However, direct transfer from fission yeast with an IEA (Inferred from Electronic Annotation) evidence code loses some of the provenance and isn't as good as an IBA (Inferred from Biological aspect of Ancestor) annotation.
So, could we a) import the IBAs for S. japonicus and b) give IBA precedence over IEA for japonicus?
(I think we said in the paper that we planned to do this)
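To spell out what part b) might mean in practice, here is a minimal sketch, assuming "precedence" means dropping an IEA annotation whenever an IBA exists for exactly the same gene and GO term. The real PomBase priority list (linked in the comments) covers more evidence codes and may treat related terms differently; the function name and the tuple layout `(gene, go_term, evidence_code)` are invented for illustration.

```python
def drop_iea_shadowed_by_iba(annotations):
    """Remove IEA annotations that duplicate an IBA for the same gene and term.

    `annotations` is an iterable of (gene, go_term, evidence_code) tuples.
    Annotations with any other evidence code pass through untouched.
    Matching is on exact GO term ID only (an assumption of this sketch).
    """
    annotations = list(annotations)
    # Every (gene, term) pair that already has an IBA annotation
    iba_keys = {(gene, term) for gene, term, code in annotations if code == "IBA"}
    return [
        (gene, term, code)
        for gene, term, code in annotations
        if not (code == "IEA" and (gene, term) in iba_keys)
    ]


example = [
    ("SJAG_02056", "GO:0005634", "IEA"),  # shadowed by the IBA below
    ("SJAG_02056", "GO:0005634", "IBA"),
    ("SJAG_02056", "GO:0005737", "IEA"),  # kept: no IBA for this term
]
kept = drop_iea_shadowed_by_iba(example)
```

The gene/term pairings in `example` are made up to show the behaviour and are not real annotations.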
CC @snezhkaoliferenko
also congrats to you and Gugs on your "amazeballs" Nat Comm. peroxisomal compartmentation paper!