Closed sierra-moxon closed 1 month ago
Thank you @sierra-moxon. It's nice, simple, and it works for me. The cautionary note contrasting the "url" property with the "xref" property is very helpful.
@codewarrior2000 - thanks Larry! The thing I am slightly worried about is that for your reactome example, to fully represent both reactome links on a single node, you'd have one xref property filled out (a CURIE) on the node and one url property filled out (the full expanded URL) on the node. Should the guidance be that when url is present on a node, any xref annotation should also be expanded fully into a URL and stored in a url property? I'm thinking of downstream consumers, like the UI, that will need special code to display one or the other.
I'm also a little worried that the definition of xref currently allows URIs. So, technically, someone could use xref for any URL/URI that is related to the node or edge it is used on. Should we restrict the definition of xref to just be a CURIE? If we do that, we may be out of sync with other databases and/or have refactoring to do.
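To make the "restrict xref to just a CURIE" option concrete, here is a minimal sketch of what such a check might look like. The pattern and the function name `looks_like_curie` are assumptions for illustration only; this is not the Biolink definition of a CURIE, and URI schemes without "//" (for example `urn:`) would slip through this approximation.

```python
import re

# Hypothetical pattern: a prefix, a colon, then a local identifier with no
# whitespace. An illustrative approximation, not the Biolink definition.
CURIE_PATTERN = re.compile(r"^[A-Za-z_][A-Za-z0-9_.\-]*:[^\s/][^\s]*$")


def looks_like_curie(value: str) -> bool:
    """Return True if value resembles a CURIE rather than a full URL/URI."""
    # Anything carrying a scheme separator is treated as a URL/URI.
    if "://" in value:
        return False
    return CURIE_PATTERN.match(value) is not None
```

With this sketch, `looks_like_curie("MGI:97486")` is true, while the Reactome Pathway Browser URL is rejected because it contains `://`.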
Per our discussion, this property is also single-valued (assuming that in TRAPI, many such attributes will be submitted in the TRAPI message when necessary). How will that be stored in the data store? Perhaps this is unnecessary for me to know, but I am trying to imagine how a non-Translator user would use this slot if they need to represent more than one of these kinds of URLs for a particular node.
@sierra-moxon, thank you. I have been wondering why the meeting discussions had been concerned about two reactome links. The original intention was just to pass along the one URL that we found in the Reactome database, which we had called the "reaction_url", which links to the Reactome Pathway Browser. (e.g., https://reactome.org/PathwayBrowser/#/R-MMU-5655466)
Is there a Biolink Model requirement that I am not aware of that requires both links?
@sierra-moxon Sorry, Sierra, I had to talk it over with Vlado about the url property being single-valued when there is still a need to handle multiple URLs. Any user with multiple URLs for a node should know to present each URL as an individual node attribute. The technical requirement will be imposed on ARAs to recognize that there can be multiple node attributes of the same type (URL).
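As a rough illustration of "each URL as an individual node attribute", the sketch below builds a TRAPI-style node with two attributes of the same type. The `attribute_type_id` and `value` keys follow the TRAPI Attribute shape; the `biolink:url` type is the slot proposed in this PR, the second URL is a placeholder, and the helper `values_for` is hypothetical.

```python
# Sketch of a TRAPI-style node carrying two URLs as separate attributes.
# "biolink:url" is the slot proposed in this PR; example.org is a placeholder.
node = {
    "categories": ["biolink:NamedThing"],
    "attributes": [
        {
            "attribute_type_id": "biolink:url",
            "value": "https://reactome.org/PathwayBrowser/#/R-MMU-5655466",
        },
        {
            "attribute_type_id": "biolink:url",
            "value": "https://example.org/another-view",  # placeholder URL
        },
    ],
}


def values_for(node: dict, attribute_type_id: str) -> list:
    """Collect every value whose attribute type matches (hypothetical helper)."""
    return [
        attr["value"]
        for attr in node.get("attributes", [])
        if attr.get("attribute_type_id") == attribute_type_id
    ]


urls = values_for(node, "biolink:url")
```

A consumer such as an ARA would then need to expect a list here rather than a single value, even though each individual attribute is single-valued.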
I'm also a little worried that the definition of xref currently allows URIs
@sierra-moxon Question. As the most conservative approach, what if we continue to let the xref property be used for both URL/URI and for CURIE? Has that broken anything in Translator yet?
thanks @codewarrior2000 and @vdancik!
TL;DR:
- The concern with biolink:url being single-valued is that it does not make sense outside of Translator technical architecture.
- url is different in Biolink from an xref, and we have it available for use in other contexts besides reactome.
- [...] xref, we likely could have nodes with one or the other (xref or url) with the intent to represent the same thing; it will be inconsistent for the UI and difficult to choose between for folks using Biolink outside of Translator.

The concern with biolink:url being single-valued is that it does not make sense outside of Translator technical architecture.
Sort of self-explanatory, but Biolink is used for KGs other than those currently in Translator, and I think without TRAPI it's difficult for a user of Biolink to use our proposed biolink:url without it being multivalued.
url is different in Biolink from an xref, and we have it available for use in other contexts besides reactome. For example, the same CURIE that represents a mouse gene can be used in many URLs to see different views of that gene at MGI:
https://www.informatics.jax.org/marker/MGI:97486 <-- the full gene page at MGI, the default URI expansion of the curie: MGI:97486
https://www.informatics.jax.org/gxd/marker/MGI:97486 <-- the gene expression information at MGI
http://www.informatics.jax.org/gxd/marker/MGI:97486?tab=imagestab <-- just the images of the gene expression at MGI
This is a very similar use case to the reactome use case. If we chose the biolink:id for a mouse gene node to be the NCBIGene identifier (NCBIGene:18504), then we could include an xref property on that node, MGI:97486. Its default URI expansion would be http://identifiers.org/mgi/MGI:97486, and this URL could be used to redirect the user to https://www.informatics.jax.org/marker/MGI:97486.
But those other two MGI links are also valid, and take the user to a different view of the data. So similarly, we'd argue in this PR that those two other MGI links are not biolink:xrefs; they are biolink:urls. But, technically, someone could provide those three MGI URLs in the biolink:xref field because we're allowing URIs as well (some handwaving here between URI and URL; I do know that the URI is technically the default expansion of this CURIE, but I'm not confident that users will take the time to disambiguate).
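The default-expansion idea in the MGI example can be sketched as follows. The prefix map here is a hypothetical one-entry map whose MGI entry mirrors the identifiers.org expansion quoted in this comment; `expand_curie` and `is_default_expansion` are illustrative names, not Biolink tooling.

```python
# Hypothetical prefix map for illustration; the MGI entry mirrors the
# identifiers.org expansion quoted in this thread.
PREFIX_MAP = {"MGI": "http://identifiers.org/mgi/MGI:"}


def expand_curie(curie: str, prefix_map: dict = PREFIX_MAP) -> str:
    """Expand a CURIE to its default URI using the prefix map."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or prefix not in prefix_map:
        raise ValueError(f"No expansion known for {curie!r}")
    return prefix_map[prefix] + local_id


def is_default_expansion(uri: str, curie: str, prefix_map: dict = PREFIX_MAP) -> bool:
    """True only when uri is exactly the default expansion of curie."""
    return uri == expand_curie(curie, prefix_map)


default_uri = expand_curie("MGI:97486")
```

Under this assumed prefix map, the gene expression and images links do not match the default expansion of MGI:97486, which is the sense in which they behave like url values rather than xref values.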
We need to disambiguate xref from url so that new users not privy to this PR can decide which property to use to provide links to other sources with different views of the "same" data.
It could be that we need a better description here: it was hard to clarify the distinction.
biolink:xref has these properties:
- xrefs are CURIEs (but Biolink also says they can be URIs) that are different from the CURIE in the biolink:id field for the node. An xref is an alternative identifier for the node.
- xref should be expandable to a URI using a prefix map.
- xref is multivalued; multiple values in the xref slot can be used to provide webpage link-outs to alternate views of the node at different databases/websites.
- [...] xref field, it must be the default expansion based on a unique prefix. Therefore, someone could provide something that technically we would consider to be a url in the xref field.
biolink:url or biolink:alternative_url has these properties:
Perhaps we should require that if an xref is provided on a node as a CURIE, it is expanded and added into a url property as well. Similarly, if the xref provided is a URI, then it should be duplicated into the url property. This helps us with consistency for downstream consumption.
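A minimal sketch of that proposed rule, under some loud assumptions: nodes are plain dicts, both xref and url are held as lists (url being multivalued is itself still under discussion here), and the prefix map is a hypothetical one-entry map. The function name `mirror_xrefs_to_urls` is made up for illustration.

```python
# Hypothetical prefix map; only the MGI expansion quoted in this thread.
PREFIX_MAP = {"MGI": "http://identifiers.org/mgi/MGI:"}


def mirror_xrefs_to_urls(node: dict, prefix_map: dict = PREFIX_MAP) -> dict:
    """Return a copy of node whose url list also covers every xref.

    CURIE xrefs are expanded via the prefix map; URI xrefs are copied as-is.
    Assumes url is stored as a list, which is an assumption of this sketch.
    """
    urls = list(node.get("url", []))
    for xref in node.get("xref", []):
        if "://" in xref:
            expanded = xref  # already a URI: duplicate it into url
        else:
            prefix, _, local_id = xref.partition(":")
            if prefix not in prefix_map:
                continue  # unknown prefix: nothing to expand
            expanded = prefix_map[prefix] + local_id
        if expanded not in urls:
            urls.append(expanded)
    return {**node, "url": urls}


example = mirror_xrefs_to_urls(
    {
        "id": "NCBIGene:18504",
        "xref": ["MGI:97486", "https://www.informatics.jax.org/gxd/marker/MGI:97486"],
    }
)
```

The original xref values are left untouched, so a consumer that only reads url and one that only reads xref would both see every link.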
At a minimum I think we need text descriptions that help disambiguate and I would welcome help here, of course. :)
@sierra-moxon We appreciate the TL;DR. I will need some time to digest the implications of how xref and url properties coexist and interplay.
…the context of the existing xref property
for your review @codewarrior2000
fixes #https://github.com/biolink/biolink-model/issues/1486