Open vanaukenk opened 2 years ago
@tmushayahama how is that list created? (What service does it call to get it?) 'pp1ab Scov2' does not look like a taxon, at least in the latest NEO file.
@balhoff @tmushayahama uses the taxon API from minerva (/taxa)
As a hint, noting that the /taxa API is returning:
{ id: "http://identifiers.org/uniprot/P0DTD1", label: "pp1ab Scov2" }
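For anyone wanting to spot-check the /taxa output, here is a minimal sketch. It assumes the response can be loaded as a list of `{id, label}` objects like the one above, and that genuine taxa use the NCBITaxon OBO PURL prefix; both are assumptions, and `suspicious_taxa` is a hypothetical helper, not part of minerva.

```python
# Assumption: real taxon entries use the NCBITaxon OBO PURL prefix.
NCBITAXON_PREFIX = "http://purl.obolibrary.org/obo/NCBITaxon_"


def suspicious_taxa(entries):
    """Return the entries whose id is not an NCBITaxon IRI."""
    return [e for e in entries if not e.get("id", "").startswith(NCBITAXON_PREFIX)]


# The first entry is the one reported above. The second is an illustrative
# well-formed entry added for contrast; its label is a placeholder.
sample = [
    {"id": "http://identifiers.org/uniprot/P0DTD1", "label": "pp1ab Scov2"},
    {"id": "http://purl.obolibrary.org/obo/NCBITaxon_2697049", "label": "example taxon"},
]

for entry in suspicious_taxa(sample):
    print(entry["id"], entry["label"])
```

Run against the full list, this would surface any other non-taxon entries that slipped in the same way.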
Noting that this is found in neo.obo:
[Term]
id: UniProtKB:P0DTD1-PRO_0000449619
name: nsp1 Scov2
synonym: "nsp1" BROAD []
synonym: "P0DTD1-PRO_0000449619" RELATED []
synonym: "protein" RELATED []
is_a: CHEBI:33695
relationship: has_gene_template PR:000050270%7CUniProtKB%3AP0DTD1-PRO_0000449635%7CPRO_0000449635
relationship: in_taxon UniProtKB:P0DTD1 ! pp1ab Scov2
property_value: https://w3id.org/biolink/vocab/category https://w3id.org/biolink/vocab/GeneProduct
property_value: https://w3id.org/biolink/vocab/category https://w3id.org/biolink/vocab/MacromolecularMachine
I don't believe in_taxon is supposed to work like that.
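A rough way to scan for other stanzas with the same problem, assuming `in_taxon` targets should always be `NCBITaxon:` CURIEs (my assumption, based on the comment above). `bad_in_taxon` is a hypothetical helper that does plain line matching, not a full OBO parse.

```python
def bad_in_taxon(obo_text):
    """Collect in_taxon targets that are not NCBITaxon CURIEs."""
    bad = []
    for line in obo_text.splitlines():
        line = line.strip()
        if not line.startswith("relationship: in_taxon "):
            continue
        # Tokens: "relationship:", "in_taxon", <target>, then an optional "! comment".
        target = line.split()[2]
        if not target.startswith("NCBITaxon:"):
            bad.append(target)
    return bad


# Abbreviated copy of the stanza quoted above.
stanza = """[Term]
id: UniProtKB:P0DTD1-PRO_0000449619
name: nsp1 Scov2
relationship: in_taxon UniProtKB:P0DTD1 ! pp1ab Scov2
"""

print(bad_in_taxon(stanza))
```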
It looks like the taxon column is off by one for GPI 1.2?
UniProtKB P0DTD1-PRO_0000449619 nsp1 Host translation inhibitor nsp1|P0DTD1(1-180)|rep/Clv:nsp1 (SARS2)|PRO_0000449619|nsp1 (SARS2)|UniProtKB:P0DTD1, 1-180|leader protein (SARS2)|UniProtKB:P0DTC1, 1-180|non-structural protein 1 (SARS2)|nsp-1|ns1|ns-1|host translation inhibitor nsp1|Severe acute respiratory syndrome (SARS) coronavirus nonstructural protein 1 protein taxon:2697049 UniProtKB:P0DTD1 PR:000050270|UniProtKB:P0DTD1-PRO_0000449635|PRO_0000449635
http://geneontology.org/docs/gene-product-information-gpi-format/
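To illustrate the off-by-one idea, here is a sketch that reads a row with the GPI 1.2 column order as I understand it from the page linked above (treat the column list as an assumption), and then reads it again shifted by one column. The row is an abbreviated version of the line quoted above with tab positions assumed, since the tabs did not survive pasting; `parse_gpi_row` and its `shift` parameter are hypothetical.

```python
import re

# Assumed GPI 1.2 column order (0-based here, 1-based in the spec).
GPI_1_2_COLUMNS = [
    "DB", "DB_Object_ID", "DB_Object_Symbol", "DB_Object_Name",
    "DB_Object_Synonym", "DB_Object_Type", "Taxon",
    "Parent_Object_ID", "DB_Xref", "Properties",
]
TAXON_RE = re.compile(r"^taxon:\d+$")


def parse_gpi_row(line, shift=0):
    """Map tab-separated fields to column names, optionally misaligned by `shift`."""
    fields = line.rstrip("\n").split("\t")
    return {
        name: (fields[i + shift] if i + shift < len(fields) else "")
        for i, name in enumerate(GPI_1_2_COLUMNS)
    }


# Abbreviated row; synonyms and xrefs are shortened, tab positions are assumed.
row = "\t".join([
    "UniProtKB", "P0DTD1-PRO_0000449619", "nsp1",
    "Host translation inhibitor nsp1", "nsp-1|ns1", "protein",
    "taxon:2697049", "UniProtKB:P0DTD1", "PR:000050270", "",
])

aligned = parse_gpi_row(row)
shifted = parse_gpi_row(row, shift=1)

print(aligned["Taxon"], bool(TAXON_RE.match(aligned["Taxon"])))
print(shifted["Taxon"], bool(TAXON_RE.match(shifted["Taxon"])))
print(shifted["DB_Object_Synonym"])
```

Under these assumptions the shifted read puts the parent ID in the taxon slot and the object type in the synonym slot, which is consistent with the `in_taxon UniProtKB:P0DTD1` and `synonym: "protein"` lines in the neo.obo stanza.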
@kltm it seems like you found the problem. But in the neo.owl I downloaded yesterday I saw in_taxon NCBITaxon:2697049. I wonder why the discrepancy?
@balhoff Yeah, there's some stuff I'm not sure about here, especially as that file has not been touched in years, so I'm not sure why it's a problem now. I'm tagging upstream contributors @cmungall and @justaddcoffee to confirm the format for GPI 1.2.
Per @cmungall, we can go ahead and manually fix this file ourselves upstream.
If we understand this correctly, this should be fixed in the next NEO release.
Hm. Apparently not. It is still appearing on the Noctua landing page.
During the QC checks for bringing Noctua up after the 2022-05-26 outage, I noticed a suspicious entry, pp1ab Scov2, in the list of species:
I thought pp1ab was a polyprotein and that's how it looks in noctua-amigo:
@balhoff @tmushayahama - can you take a look to see why this entry is included as a species? Thanks.
Also tagging @kltm