PROconsortium / PRoteinOntology

Externally linked urls #198

Closed Julie-Cowart closed 3 years ago

Julie-Cowart commented 4 years ago

The URLs we link to externally on the entry pages are supposed to come from the idspace headers in the OBO file. In practice, they were implemented as a static file, external_links.json, because some of the links are generated in JavaScript that doesn't have easy access to the DB. This is a problem because it's a large list (there are 69 entries in the OBO) and we don't always keep it up to date when the OBO changes. I found several missing entries (resulting in no link) and wrong entries (resulting in a possibly outdated link).

I have implemented the change to actually pull this from the DB (as loaded from the OBO file). I went through and checked each idspace prefix once (and noted the changes above), but more testing should be done. It seems to work, but some issues remain:

  1. Araport URLs don't resolve (e.g. https://proconsortium.org/app_test/entry/PR%3AP04778/ links to https://www.araport.org/locus/AT1G29930, which returns a 404 error), so perhaps this should be changed in the OBO file. araport.org says it is no longer funded. Options are linking to https://plants.ensembl.org/Arabidopsis_thaliana/Gene/Summary?g=AT1G29930 or https://www.arabidopsis.org/servlets/TairObject?type=locus&name=AT1G29930, or suppressing the links entirely (with an override list).
  2. DOID, GO, and NCBITaxon are now PURLs, which redirect to Ontobee. Is this preferred? We could keep an override list that restores the previous URLs for just these cases.
  3. EnsemblBacteria is currently giving a server error (e.g. http://ensemblgenomes.org/id/SaurJH9_1189), but it is still indexed by Google, so this is hopefully temporary.
  4. The HIstome_ptm and HIstome_var server, http://www.actrec.gov.in/, does not seem to respond. We could fix this or leave it, depending on preference.
  5. PID: the OBO gives the prefix as http://www.ndexbio.org/#/search?searchType=Networks&searchTermExpansion=false&searchString=cd8tcrpathway, but this is wrong. The prefix should end at the last =, since the rest is one of the IDs. This makes the links wrong, so it needs fixing ASAP.
  6. SGD links used to look like http://www.yeastgenome.org/locus/S000001952, but according to the OBO the URL is http://www.yeastgenome.org/cgi-bin/locus.fpl?dbid=S000001952. That does redirect to the other URL, but http://www.yeastgenome.org/locus/ is a cleaner idspace for the OBO, so I think we should change it.
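
For readers following along, the idea of resolving links from the idspace headers instead of a static file can be sketched roughly as below. This is a minimal Python sketch, not the site's actual implementation: the names `parse_idspaces`, `resolve_link`, and `EXAMPLE_HEADER` are hypothetical, the quoted description on the SGD line is illustrative only, and it assumes header lines of the form `idspace: PREFIX URL-prefix` with an optional quoted description after the URL.

```python
def parse_idspaces(obo_text):
    """Collect `idspace:` header lines into a {prefix: url_prefix} dict."""
    idspaces = {}
    for line in obo_text.splitlines():
        line = line.strip()
        if line.startswith("["):
            # First stanza reached; the header section is over.
            break
        if not line.startswith("idspace:"):
            continue
        # Expected fields: PREFIX, URL prefix, optional quoted description.
        parts = line[len("idspace:"):].split(None, 2)
        if len(parts) >= 2:
            idspaces[parts[0]] = parts[1]
    return idspaces


def resolve_link(curie, idspaces):
    """Return the external URL for an ID such as 'SGD:S000001952', or None."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or prefix not in idspaces:
        return None  # unknown prefix: show no link rather than a wrong one
    return idspaces[prefix] + local_id


# Hypothetical header excerpt, using prefixes discussed in this issue.
EXAMPLE_HEADER = """format-version: 1.2
idspace: Araport https://bar.utoronto.ca/thalemine/portal.do?externalids=
idspace: SGD http://www.yeastgenome.org/locus/ "illustrative description"

[Term]
id: PR:000000001
"""

if __name__ == "__main__":
    spaces = parse_idspaces(EXAMPLE_HEADER)
    print(resolve_link("SGD:S000001952", spaces))
```

Returning `None` for an unknown prefix mirrors the "missing entry results in no link" behaviour described above; whether that is the right fallback is a design choice for the site.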

nataled commented 4 years ago

Pulling from the OBO-provided list is a good idea, especially since I thought this was happening all along. For your questions:

  1. Araport - I was informed about Araport late last week. Plant terms used to link to TAIR, but then Araport took over. For the moment, I changed the URL to the one used by UniProt:

     idspace: Araport https://bar.utoronto.ca/thalemine/portal.do?externalids=

     Technically this goes to ThaleMine, not Araport, so we might need to change the prefix at some point. However, for now, we'll keep Araport, as that's the prefix used by UniProt also.

  2. DOID, GO, and NCBITaxon - Always use PURLs. It is up to the maintainers to provide the preferred destination.

  3. EnsemblBacteria - The issue was indeed temporary. Just tried it and it works.

  4. HIstome_ptm and HIstome_var - Seems the resource has moved. I revised the OBO file for these:

     idspace: HIstome_var http://www.iiserpune.ac.in/~coee/histome/variants.php?variant=
     idspace: HIstome_ptm http://www.iiserpune.ac.in/~coee/histome/ptm_sp.php?ptm_sp=

  5. PID - Fixed in OBO in the way you suggested.

  6. SGD - Fixed in OBO in the way you suggested.
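
The PID fix amounts to cutting the prefix after its last `=` so that no ID remains embedded in it. A rough Python sketch of that check is below. The helper names are hypothetical, and the heuristic (flag any query-style prefix that does not end in `=`) is an assumption that may not suit every resource.

```python
def trim_to_last_equals(url_prefix):
    """Cut a prefix after its last '=' so no ID is left embedded in it."""
    head, sep, _tail = url_prefix.rpartition("=")
    # If there is no '=' at all, leave the prefix untouched.
    return head + sep if sep else url_prefix


def suspicious_prefixes(idspaces):
    """Return names whose query-style URL prefix has text after the last '='."""
    return sorted(
        name
        for name, url in idspaces.items()
        if "?" in url and "=" in url and not url.endswith("=")
    )


if __name__ == "__main__":
    bad_pid = (
        "http://www.ndexbio.org/#/search?searchType=Networks"
        "&searchTermExpansion=false&searchString=cd8tcrpathway"
    )
    print(trim_to_last_equals(bad_pid))
    print(suspicious_prefixes({"PID": bad_pid, "SGD": "http://www.yeastgenome.org/locus/"}))
```

Path-style prefixes such as http://www.yeastgenome.org/locus/ contain no query string, so this heuristic leaves them alone.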

Julie-Cowart commented 4 years ago

We will have to wait for the next release before the OBO changes are applied (even to the test site), so I will hold off on pushing to production until we can test during prerelease.

Julie-Cowart commented 3 years ago

Now that the next release is loaded on the test site, I have confirmed that the following are now working as expected: Araport, EnsemblBacteria, PID, and SGD.

However, HIstome_ptm and HIstome_var do not seem to work. E.g. https://proconsortium.org/app_test/entry/PR%3A000045400/ references HIstome_var:H1.4, which links to http://www.iiserpune.ac.in/~coee/histome/variants.php?variant=H1.4, and that gives an error. A Google search suggests the correct URL might be http://www.actrec.gov.in/histome2/Human/histone_info.php?Dtype=variants&prot=Histone_H1.4
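
Checks like the ones above could in principle be automated. The following is a hedged Python sketch using only the standard library, not something from this repository: `http_status` and `find_broken_links` are hypothetical names, and it assumes the target servers answer HEAD requests (some do not). A page that returns 200 but displays an error message in its body would not be caught.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the final HTTP status for url, or None if it is unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        # urlopen follows redirects, so this is the status after redirection.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 or 500
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout, ...


def find_broken_links(urls, get_status=http_status):
    """Map each URL that does not answer with a 2xx status to what it returned."""
    broken = {}
    for url in urls:
        status = get_status(url)
        if status is None or not 200 <= status < 300:
            broken[url] = status
    return broken
```

The `get_status` parameter is there so the check can be exercised with a stub instead of live requests, for example `find_broken_links(urls, get_status=lambda u: 404)`.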

Julie-Cowart commented 3 years ago

The HIstome_ptm links don't seem to show because we don't link synonym sources (if we want to start doing that, we should open a new issue). However, https://proconsortium.org/app_test/entry/PR%3A000045400/ does have two synonyms, whose sources would be HIstome_ptm:H1K25me1 and HIstome_ptm:H1S26ph. The equivalent URLs on the site listed above would be http://www.actrec.gov.in/histome2/Human/ptm_sp.php?ptm_sp=H1K25me1 and http://www.actrec.gov.in/histome2/Human/ptm_sp.php?ptm_sp=H1S26ph

nataled commented 3 years ago

No need for the synonym sources. More concerning is that these links do show under the definition sources, and they are the incorrect ones (which is quite annoying since I verified those links before implementing). I will make the change in the source file.

Julie-Cowart commented 3 years ago

Do I need to do another round of prerelease then?

nataled commented 3 years ago

Looking at the HIstome site, they made it so that each organism requires a different URL. I'll thus need to make further changes in the OBO file to ensure we have the flexibility to add other organisms (all are currently human).

nataled commented 3 years ago

With this release we removed the 'variant' HIstome links completely, and the 'ptm' links were never in place. Therefore, this issue is fully dealt with and can be closed. Further HIstome link issues require curation effort and will be tracked in a new ticket, #218.