Closed DeniseSl22 closed 1 year ago
The URI for an HGNC Accession Number should be: https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/HGNC:13890 . For HGNC symbols, the URIs on identifiers.org no longer work... @egonw
Yes, we have a problem here. The URL for symbols just doesn't work. I'll have to remove it.
But that would mean that all GeneProducts and Proteins annotated with an HGNC symbol will not have a working linkout on the WP website... Should we write a bot to convert them to HGNC IDs (existing ones and future ones)? @mkutmon @AlexanderPico ?
Any update on this issue? @DeniseSl22
Nope. We should update all GPMLs that contain HGNC symbols to get the linkouts working again. It would also be good to change PV4 so that it no longer shows HGNC symbols when looking for an identifier @mkutmon . And we could write a unit test for HGNC symbols in the future, so we can curate them later @egonw .
We can try to fix it in the hackathon planned for Feb
I was just reading the whole thread. I think it makes perfect sense for people to use HGNC symbols and not IDs. Conceptually, HGNC symbols were meant to be the thing used by the community, as they are unique, can be resolved, and are both human readable (they add meaning) and machine readable. HGNC is thus a special case where we want to allow people to use both symbols and IDs. If you want to automate anything, it would be more the other way around: for example, a quick (hover?) lookup that shows which symbol is meant when only the ID is there.
@Chris-Evelo : that's the thing, the symbols are no longer resolvable... Through BridgeDb, we can retrieve the HGNC symbol based on any other ID, but if the HGNC symbol is the ID used for annotation, the linkouts don't work anymore.
Step 1:
[ERROR] Failures:
[ERROR] Genes.numericHGNCIDs:73->JUnitTests.performAssertions:24 Found integer HGNC symbols (did you mean 'HGNC Accession number'?): 4. Details:
http://www.wikipathways.org/instance/WP4919_r123490 AKT has 391
http://www.wikipathways.org/instance/WP5130_r123523 TCRA has 12027
http://www.wikipathways.org/instance/WP5130_r123523 TCRB has 12155
http://www.wikipathways.org/instance/WP288_r118398 MAPK has 651
==> expected: <0> but was: <4>
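The check behind Step 1 can be sketched roughly as follows. This is a minimal Python illustration, not the actual JUnit test: the `(pathway, label, datasource, identifier)` tuple layout, the function name, and the use of the string `"HGNC"` to mean "annotated as HGNC symbol" are all assumptions made for the example. The first four rows are the ones from the report above; the last row is a hypothetical well-formed entry.

```python
import re


def find_numeric_hgnc_symbols(annotations):
    """Return (pathway, label, identifier) for entries typed as HGNC symbol
    whose identifier consists only of digits (probably an accession number)."""
    return [
        (pathway, label, identifier)
        for pathway, label, datasource, identifier in annotations
        # Assumption: the data source name "HGNC" denotes the symbol type.
        if datasource == "HGNC" and re.fullmatch(r"[0-9]+", identifier)
    ]


annotations = [
    ("WP4919_r123490", "AKT", "HGNC", "391"),
    ("WP5130_r123523", "TCRA", "HGNC", "12027"),
    ("WP5130_r123523", "TCRB", "HGNC", "12155"),
    ("WP288_r118398", "MAPK", "HGNC", "651"),
    ("WP0000_r0", "SEPTIN1", "HGNC", "SEPTIN1"),  # hypothetical valid row
]

suspects = find_numeric_hgnc_symbols(annotations)
for pathway, label, identifier in suspects:
    print(f"{pathway} {label} has {identifier}")
```

Only the all-digit identifiers are flagged, so the symbol-shaped entry passes.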
OK, I do understand now, and of course we want them to be resolved. So is that an identifiers.org issue? n2t.net actually resolves the CURIE. I just tried n2t.net/hgnc:septin1 and that resolves fine. Of course, we could also solve it by converting any HGNC symbols to HGNC IDs on the fly, before we do the linkout. That would still be in line with the philosophy of the HUGO Gene Nomenclature Committee that human users should not have to know or ever see the identifiers.
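The on-the-fly conversion suggested here could look roughly like this. It is a sketch under stated assumptions: the function name, the data source strings, and the lookup table are hypothetical, and the symbol `GENE_A` is a placeholder, not the real symbol for `HGNC:13890`. In practice the table would have to come from a mapping source such as BridgeDb. Only the URL pattern is taken from the first comment in this thread.

```python
# URL pattern for HGNC accession numbers, as given earlier in this thread.
HGNC_REPORT_URL = (
    "https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/{hgnc_id}"
)

# Hypothetical symbol -> accession lookup; "GENE_A" is a placeholder symbol.
SYMBOL_TO_HGNC_ID = {"GENE_A": "HGNC:13890"}


def linkout_url(datasource, identifier, symbol_to_id):
    """Build a linkout URL, converting a symbol to an HGNC ID first if needed.

    Returns None when a symbol cannot be mapped, so the caller can decide
    whether to hide the link or fall back to something else.
    """
    if datasource == "HGNC":  # assumption: this name denotes a symbol
        hgnc_id = symbol_to_id.get(identifier)
        if hgnc_id is None:
            return None
    elif datasource == "HGNC Accession number":
        hgnc_id = identifier
    else:
        raise ValueError(f"not an HGNC data source: {datasource}")
    return HGNC_REPORT_URL.format(hgnc_id=hgnc_id)


url_from_symbol = linkout_url("HGNC", "GENE_A", SYMBOL_TO_HGNC_ID)
url_from_id = linkout_url("HGNC Accession number", "HGNC:13890", SYMBOL_TO_HGNC_ID)
unmapped = linkout_url("HGNC", "UNKNOWN_SYMBOL", SYMBOL_TO_HGNC_ID)
```

With this approach the annotation in the GPML can stay a symbol, and only the link target changes.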
Step 2 was to test whether we had non-numeric HGNC Accession numbers, which we do not seem to have. Not many anyway: https://bit.ly/3H1QUhZ
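The Step 2 check is the mirror image of Step 1 and might be sketched like this. The accepted format (digits, optionally prefixed with `HGNC:`) is an assumption for illustration, as are the function name and the sample list; the actual test may apply a different rule.

```python
import re

# Assumption: a valid accession number is digits, optionally with "HGNC:" prefix.
ACCESSION_PATTERN = re.compile(r"(HGNC:)?[0-9]+")


def find_non_numeric_accessions(identifiers):
    """Return identifiers that do not look like HGNC accession numbers."""
    return [i for i in identifiers if not ACCESSION_PATTERN.fullmatch(i)]


# Illustrative sample: two well-formed values and two malformed ones.
sample = ["HGNC:13890", "391", "SEPTIN1", "HGNC:"]
bad = find_non_numeric_accessions(sample)
print(bad)
```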
@tabbassidaloii, great idea! It seems to me that what we are missing in the BridgeDb mapping files are mappings between the HGNC symbols and HGNC Accession Numbers. Correct?
Now, to use any such work, we typically need a new PathVisio release too, though the webservices (and therefore WikiPathways) and the RDF generation can still use a recent BridgeDb directly.
I am not sure if that is correct, but I will look into it.
Fixed. But I would still like to see those mappings.
See also bridgedb/datasources#6 ; HGNC linkout on WikiPathways website, and PathVisio don't work...
Result: