geneontology / noctua

Graph-based modeling environment for biology, including prototype editor and services
http://noctua.geneontology.org/
BSD 3-Clause "New" or "Revised" License

Some issues with SGD's complexes #924

Open kltm opened 4 weeks ago

kltm commented 4 weeks ago

Noting here there are still some issues with SGD's complexes:

https://release.geneontology.org/2024-09-08/annotations/sgd.gpi.gz has

SGD:S000218003 CPX-1852 RPD3L histone deacetylase complex|RPD3L complex|RPD3(L)|RPD3/SIN3 large histone deacetylase complex|3.5.1.98|8GA8|EMD-29892|8HPO|EMD-34935 protein_complex taxon:559292

https://release.geneontology.org/2024-09-08/products/upstream_and_raw_data/sgd-src.gpi.gz has:

SGD:S000218003 ASH1:CTI6:DEP1:PHO23:RPD3:RXT2:RXT3:SAP30:SDS3:SIN3:UME1:UME6 RPD3L histone deacetylase complex GO:0032991 taxon:559292 S000005274|S000005274|S000001346|S000001346|S000001346|S000005364|S000005364|S000005364|S000005364|S000000299|S000000299|S000000299|S000000299|S000000299|S000006060|S000006102|S000004876|S000004876|S000005041|S000005041|S000005041|S000002234|S000002234|S000000011|S000000011|S000000011 ComplexPortal:CPX-1852

To find this complex in Noctua, the only current way is to enter S000218003 or SGD:S000218003 in the Term box, where the entity pops up as ASH1CTI6DEP1PHO23RPD3RXT2RXT3SAP30SDS3SIN3UME1UME6 Scer. Curators expect CPX-1852 to work, but it doesn't. I've found that searching for ASH in the Term box also works, but that's not an obvious name and isn't in the GPI provided by SGD.

SGD is modifying the supplied GPI, and the next available GPI from SGD will look more like /annotations/sgd.gpi:

SGD:S000218003 CPX-1852 RPD3L histone deacetylase complex GO:0032991 taxon:559292 S000005274|S000005274|S000001346|S000001346|S000001346|S000005364|S000005364|S000005364|S000005364|S000000299|S000000299|S000000299|S000000299|S000000299|S000006060|S000006102|S000004876|S000004876|S000005041|S000005041|S000005041|S000002234|S000002234|S000000011|S000000011|S000000011 ComplexPortal:ASH1:CTI6:DEP1:PHO23:RPD3:RXT2:RXT3:SAP30:SDS3:SIN3:UME1:UME6
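For anyone comparing the two sgd-src rows field by field, here is a rough Python sketch that splits a row into named columns. It assumes the rows are tab-separated and follow the GPI 2.0 column order; the page flattens whitespace, so the tabs and empty columns below are reconstructed by guess. The names `GPI_COLUMNS`, `parse_gpi_row`, and `make_row` are made up for illustration, and the member list is shortened.

```python
# Assumed GPI 2.0 column order; the short keys are illustrative names only.
GPI_COLUMNS = [
    "id", "symbol", "name", "synonyms", "type", "taxon",
    "encoded_by", "parent_protein", "complex_members", "xrefs", "properties",
]

MEMBERS = "ASH1:CTI6:DEP1:PHO23:RPD3:RXT2:RXT3:SAP30:SDS3:SIN3:UME1:UME6"


def parse_gpi_row(line):
    """Split one tab-separated row into a dict keyed by the assumed column names."""
    fields = line.rstrip("\n").split("\t")
    # Pad short rows so trailing empty columns are still present in the dict.
    fields += [""] * (len(GPI_COLUMNS) - len(fields))
    return dict(zip(GPI_COLUMNS, fields))


def make_row(symbol, xref):
    """Rebuild a row for SGD:S000218003 (member IDs shortened for the example)."""
    return "\t".join([
        "SGD:S000218003", symbol, "RPD3L histone deacetylase complex",
        "", "GO:0032991", "taxon:559292", "", "",
        "S000005274|S000001346", xref, "",
    ])


# Row as currently supplied in sgd-src.gpi.gz.
current = parse_gpi_row(make_row(MEMBERS, "ComplexPortal:CPX-1852"))
# Row as described for the next SGD GPI.
planned = parse_gpi_row(make_row("CPX-1852", "ComplexPortal:" + MEMBERS))

for label, row in (("current", current), ("planned", planned)):
    print(label, "symbol:", row["symbol"], "| xrefs:", row["xrefs"])
```

Under these assumptions, the symbol column is what changes from the member string to CPX-1852, and the member string moves into the xref column.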

Strongly related ticket https://github.com/geneontology/noctua/issues/914

Originally posted by @suzialeksander in https://github.com/geneontology/noctua/issues/910#issuecomment-2364648262

kltm commented 4 weeks ago

While (finally) digging into this a little, it seems that it may no longer be an issue?

Tagging @vanaukenk @suzialeksander

vanaukenk commented 4 weeks ago

Checking the autocomplete for SGD complexes after tonight's outage, they seem to be there now.

Specifically checking CPX-1852: http://noctua.geneontology.org/workbench/noctua-form/?model_id=gomodel%3A671ae02600000000

@suzi can you please confirm that this is now working for the SGD curators?

@suzi - we also had a question about the ComplexPortal xref entry in the SGD gpi file and whether this is the correct format for a ComplexPortal accession or should it be something like ComplexPortal:CPX-1852? https://github.com/geneontology/go-site/blob/master/metadata/db-xrefs.yaml

suzialeksander commented 3 weeks ago

> Checking the autocomplete for SGD complexes after tonight's outage, they seem to be there now.
>
> Specifically checking CPX-1852: http://noctua.geneontology.org/workbench/noctua-form/?model_id=gomodel%3A671ae02600000000
>
> @suzi can you please confirm that this is now working for the SGD curators?

Yes, we see them! Fantastic!

> @suzi - we also had a question about the ComplexPortal xref entry in the SGD gpi file and whether this is the correct format for a ComplexPortal accession or should it be something like ComplexPortal:CPX-1852? https://github.com/geneontology/go-site/blob/master/metadata/db-xrefs.yaml

From the GPI specs, col 2 is DB_Object_Symbol, not DB:DB_Object_Symbol, and "It is not a unique identifier or an accession number", so that's what we supplied. Although I'd agree that ComplexPortal:CPX-1852 is the correct form of the identifier, curators would MUCH rather not need to type the extra prefix when searching. Does what we supply in the GPI show up in other places where it could lead to confusion?
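The xref format question could be checked mechanically. A minimal sketch, assuming a ComplexPortal accession is `CPX-` followed by digits; that pattern is inferred only from the CPX-1852 example in this thread, not taken from db-xrefs.yaml. The helper name `is_complexportal_curie` is hypothetical.

```python
import re

# Assumed pattern: "ComplexPortal:" prefix, then "CPX-" and one or more digits.
COMPLEXPORTAL_CURIE = re.compile(r"ComplexPortal:CPX-\d+")


def is_complexportal_curie(xref):
    """Return True if the whole string matches the assumed ComplexPortal CURIE form."""
    return COMPLEXPORTAL_CURIE.fullmatch(xref) is not None


print(is_complexportal_curie("ComplexPortal:CPX-1852"))  # True
print(is_complexportal_curie(
    "ComplexPortal:ASH1:CTI6:DEP1:PHO23:RPD3:RXT2:RXT3:SAP30:SDS3:SIN3:UME1:UME6"
))  # False
```

A bare `CPX-1852` in the symbol column would not match either, which fits the point above that col 2 is a symbol rather than a prefixed identifier.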