Include dbxrefs in usage analysis

matentzn commented 3 years ago

see #29 for context. Thanks @lschriml

matentzn commented 3 years ago

@bpeters42 has reservations, see #29

alanruttenberg commented 3 years ago

I'm with Bjoern on this. I too often see dbxrefs for terms that are, or should be, part of an ontology and connected by a more informative relation. Dbxrefs also don't elucidate the connection between the referenced thing and the citing term. What exactly are the conditions that must hold for a dbxref to be included? What should a user understand when they see a dbxref?

Bottom line is that in my experience, more often than not, the presence of lots of dbxrefs is a signal that there's a problem. I'd almost say that if you included it in a metric, it should be negative.

Alan

matentzn commented 3 years ago

Just to be clear, the question here is:

If O1 contains this statement:

A hasDbXref O2:B

should we count this as a "use" of O2:B, or ignore it? This matters for deciding the overall impact of ontologies in OBO, many of which are referenced by xrefs, and only by xrefs (DO and HPO have many, many xrefs, for example).
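A minimal sketch of the two ways of counting, assuming the ontology's statements are available as (subject, predicate, object) triples with CURIE-style identifiers. The function name `count_uses` and the example triples are hypothetical and are not part of the dashboard code.

```python
from collections import Counter

# Predicate used for database cross-references (oio:hasDbXref).
XREF = "oio:hasDbXref"


def count_uses(triples, source_prefix, include_xrefs):
    """Count references from one ontology to terms of other ontologies.

    triples: iterable of (subject, predicate, object) CURIE strings.
    source_prefix: prefix of the citing ontology, e.g. "O1".
    include_xrefs: whether oio:hasDbXref statements count as a "use".
    """
    uses = Counter()
    for _subject, predicate, obj in triples:
        if predicate == XREF and not include_xrefs:
            continue  # ignore xrefs when they are not treated as usage
        target_prefix = obj.split(":", 1)[0]
        if target_prefix != source_prefix:
            uses[target_prefix] += 1
    return uses


# Hypothetical statements in O1: one xref and one logical axiom.
example = [
    ("O1:A", "oio:hasDbXref", "O2:B"),
    ("O1:A", "rdfs:subClassOf", "O3:C"),
]
with_xrefs = count_uses(example, "O1", include_xrefs=True)
without_xrefs = count_uses(example, "O1", include_xrefs=False)
```

With xrefs included, O2 receives one "use"; with xrefs ignored, O2 receives none, which is the difference under discussion.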

bpeters42 commented 3 years ago

If O2 is outside of OBO, then dbxrefs are appropriate and useful. It also wouldn't count one way or the other. If both O1 and O2 are in OBO, then (as Alan also said) the use of dbxrefs rather than actual re-use of terms is potentially a problem of overlapping scope. Ideally we would resolve those occurrences by having the teams work together. And to be clear, the problem there would seem to be with O1.

matentzn commented 3 years ago

Sounds reasonable for the case where xrefs are used in the sense of "here is an equivalent term elsewhere". But what if they are used to map across species (MP-HP)? Or disease to phenotype? Or just as a fill-in for any relation, to say that there is some association between O1:A and O2:B? I agree this is not the purpose of dbxrefs; it's just what they were abused for because of the OBO format. :) Of course, it's better if we fix this thing with the xrefs once and for all and use the appropriate logical relations instead.

matentzn commented 3 years ago

This issue does have a bit of contention, so I would like to make another suggestion. @lschriml rightfully feels that excluding mappings from the usage score leads to a gross underrepresentation of 'impact' (which is what we are trying to approximate with usage). So I would suggest the following:

We introduce a new mapping_score.
This mapping score explicitly looks at oio:hasDbXref, and skos:*Match, for incoming mappings.
Mathematically, we allow this score to "boost" the overall OBO score, but not to penalise it: I am thinking that if, say, you have > 10,000 xrefs pointing at your ontologies, your OBO score gets pushed up by 0.2. This is simple to understand and simple to implement, and makes the situation fairer, IMO. Please let me know what you think!

This will only affect a tiny handful of ontologies like DO & HP that are often not re-used directly, but xref'd to, for example by MOD-Os (Model Organism Database Ontologies). I think this is a fair compromise!
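The boost rule described above could be sketched roughly as follows. The threshold of 10,000 and the boost of 0.2 come from the proposal; the function name `boosted_obo_score` and the cap at 1.0 are assumptions made for illustration.

```python
def boosted_obo_score(base_score, incoming_mappings,
                      threshold=10_000, boost=0.2, max_score=1.0):
    """Apply a one-sided boost for heavily mapped-to ontologies.

    incoming_mappings: number of incoming oio:hasDbXref / skos:*Match
    statements pointing at the ontology. The score is never lowered;
    capping at max_score is an assumption, not part of the proposal.
    """
    if incoming_mappings > threshold:
        return min(max_score, base_score + boost)
    return base_score
```

Because the function only ever adds to the base score, an ontology with few or no incoming mappings is unaffected.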

lschriml commented 3 years ago

@matentzn - I like this proposal, will show re-usage via xrefs.

Regarding @bpeters42 and @alanruttenberg comments: Can we ask OBOF ontologies that are using xrefs to instead re-use terms/IDs?

Suggestions??

For example, [looking in OLS, querying](https://www.ebi.ac.uk/ols/search?q=DOID%3A+%28MONDO%3AequivalentTo%29&ontology=mondo): DOID: (MONDO:equivalentTo)
--> there are 9710 results
- thus overlapping scope (90% of DO).

Cheers, Lynn

matentzn commented 3 years ago

Yeah, there are many, many ontologies cross-referencing DO. I will develop this feature. I think it's appropriate!

alanruttenberg commented 3 years ago

I can't see any good reason why MONDO wouldn't just reuse DO. This looks like a failure of the collaboration principle. What's going on?

mellybelly commented 3 years ago

Mondo focuses on alignment across many sources, DO is one of them. This has blocked our ability to align diagnostics and disease definitions around the world. There are multiple papers explaining this problem: https://www.annualreviews.org/doi/abs/10.1146/annurev-biodatasci-080917-013459?journalCode=biodatasci https://www.nature.com/articles/d41573-019-00180-y

alanruttenberg commented 3 years ago

No doubt it is a problem. That doesn't mean we also don't have a failure of the collaboration principle.

matentzn commented 3 years ago

Thank you all for the perspectives. Irrespective of the question of whether xrefs are good or bad, it's a fact that there are 2 million of these, many of them between OBO ontologies. This won't change any time soon, so I will incorporate them into the impact score as @lschriml suggests, and we will discuss at some future point what our general stance is as OBO Foundry. I am very excited about all these debates now being held! 🙏🖖🏼😃

alanruttenberg commented 3 years ago

I don't see any kind of consensus for having inter-ontology dbxrefs boost the impact score. What I do see agreement on is that where something other than an ontology uses a dbxref to an ontology, that is considered positive. When an ontology has a dbxref to something other than an ontology, that contributes to utility but isn't relevant for determining impact.

So, I propose that if we are going to count inter-ontology dbxrefs then they are counted separately from the others and that only the others, pending resolution of this issue, can boost the impact score. Further, in the display of inter-ontology dbxrefs the unresolved concern documented here is advertised, so that we aren't seen as promoting the practice.

I remain of the opinion that the inter-ontology dbxrefs are a negative rather than a positive signal.

bpeters42 commented 3 years ago

I agree with alan.

nataled commented 3 years ago

In general I also agree with Alan, except there are (I think) legitimate cases for an ontology to contain inter-ontology dbxrefs that are NOT a failing on the part of the ontology containing those dbxrefs. I can think of two types of cases. In one, the dbxref is to a term in an external ontology that has been (or will be) retired by that external ontology. In the second, the dbxrefs are to terms in the external ontology that ITSELF failed to re-use the primary source ontology.

lschriml commented 3 years ago

I agree with Alan, Peter and Darren.

matentzn commented 3 years ago

Oh you agree @lschriml ? Great. I was preparing a defence speech here for your proposal :-) Alright then :). Please correct me if I am wrong. We do this:

[ ] we count inter-ontology dbxrefs separately from others and do not include them in the overall obo score (impact). Concretely, and contrary to what I thought @lschriml wanted, that means xrefs from other OBO ontologies will not boost your score.
[ ] dbxrefs from non-ontology sources could boost, if we find a way to count them (future work)
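One way the two checklist items could be sketched: incoming xrefs are tallied in two separate counters, depending on whether the citing source is an OBO ontology. The prefix set, the function name `split_incoming_xrefs`, and the (source, target) input shape are all hypothetical; how to collect xrefs from non-ontology sources is left open above.

```python
from collections import Counter

# Hypothetical subset of OBO ontology prefixes, for illustration only.
OBO_PREFIXES = frozenset({"DOID", "HP", "MP", "MONDO"})


def split_incoming_xrefs(xrefs, obo_prefixes=OBO_PREFIXES):
    """Tally incoming xrefs per target ontology in two separate counters.

    xrefs: iterable of (source, target_curie) pairs, where source names
    the ontology or database that contains the xref.
    Returns (inter_ontology, external). Under the plan above, only the
    external counter would be a candidate for boosting the score.
    """
    inter_ontology, external = Counter(), Counter()
    for source, target in xrefs:
        target_prefix = target.split(":", 1)[0]
        if source in obo_prefixes:
            inter_ontology[target_prefix] += 1  # reported, never boosts
        else:
            external[target_prefix] += 1  # could boost (future work)
    return inter_ontology, external


# Hypothetical data: two xrefs from ontologies, one from a database.
inter, ext = split_incoming_xrefs([
    ("MONDO", "DOID:0000001"),
    ("MP", "HP:0000002"),
    ("SomeDatabase", "DOID:0000003"),
])
```

Keeping the two tallies apart means the inter-ontology numbers can still be displayed without feeding into the overall score.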

Alright, while I don't quite agree with the severity of the argumentation against xrefs from a practical standpoint (for me, xrefs just mean "loose mapping, useful for machine learning algorithms"), I know where the OBO Foundry is coming from: the whole vision of logical interoperability. I am ok with bowing to what appears here to be some kind of majority consensus and keeping the score separate. But just for posterity, I want to say that in areas such as disease or social stuff, where modelling is driven not so much by some kind of physical reality but by the (often obscure) textbooks studied during undergraduate studies, we can end up describing the same concepts in incompatible logical frameworks. In such cases, I would argue that an xref is still better than nothing. I do however see that logical integration is of course the holy grail.

nataled commented 3 years ago

Further to my point, I just realized that in both of the cases I mentioned, the dbxrefs would be to ontologies with a different scope and of a different type. To provide a concrete example, PRO (a reference ontology for proteins) was asked to take over some protein terms from MIRO/IRO (application ontologies for insect resistance). I would say that if ontology A has dbxrefs to ontology B and both have the same scope (and especially if both are reference ontologies), then it is cause for concern. It might be a problem even if both are reference ontologies with different scope. Not sure how to code all that without some metadata in place, but at least this gives something to go on.
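A rough sketch of how that heuristic might look if the metadata existed. The `OntologyMeta` record, its `scope` and `kind` fields, and the concern labels are all assumptions; no such metadata is described as being in place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OntologyMeta:
    """Hypothetical per-ontology metadata needed by the heuristic."""
    scope: str  # e.g. "protein", "insect resistance"
    kind: str   # "reference" or "application"


def xref_concern(a, b):
    """Grade the concern raised by dbxrefs from ontology a to ontology b."""
    both_reference = a.kind == "reference" and b.kind == "reference"
    if a.scope == b.scope:
        # Same scope is a concern, more so for two reference ontologies.
        return "high" if both_reference else "concern"
    if both_reference:
        # Different scope, both reference: might still be a problem.
        return "possible"
    return "none"


# The PRO / MIRO example: different scope and different type.
pro = OntologyMeta(scope="protein", kind="reference")
miro = OntologyMeta(scope="insect resistance", kind="application")
pro_miro = xref_concern(pro, miro)
```

For the PRO and MIRO example the function returns "none", matching the observation that such dbxrefs are not a failing of the citing ontology.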

lschriml commented 3 years ago

Hello @Nico Matentzoglu - apologies for any confusion :)

as xrefs are currently represented: yes --> we count inter-ontology dbxrefs separately from others and not include them in the overall obo score

Regarding earlier posts: I am in favor of term reuse over xrefing, as in the example of DO and MONDO. Instead of inter-ontology dbxrefs, we should encourage use of the primary ontology term ID rather than its xref. This would promote collaborative development and increase interoperability across OBOF ontologies.

Cheers, Lynn

mellybelly commented 3 years ago

Doing a logical analysis across ontologies for logical coherence (whether via URI reuse or xrefs) is probably the best approach to examining and promoting interoperability. There are a lot of poorly defined and conflicting xrefs (see the paper above). @LEHunter had done some work examining this I think. We've been talking about doing this type of analysis in OBO for a long time.

FWIW, I don't think the number of xrefs is a quality metric. In some cases it will be inversely correlated with quality and in others positively correlated. Metric design should be considered carefully in terms of what the intended outcome actually is (I think about evaluation too much these days). I also don't believe that an xref = usage, and such a metric could actually promote further poor xref'ing.

alanruttenberg commented 3 years ago

I think @nataled's notes about various understandable uses of xrefs are on the right track. It would be profitable to explore further, enumerate them, and then issue some guidance. One thing that we could do now is to add a question to the request to be in the library along the lines of: "If you use xrefs in your ontology, please explain the reasons/rationale for doing so".

And, of course, I'm all for @mellybelly's point about cross ontology logical coherence.

OBOFoundry / OBO-Dashboard

Include dbxrefs in usage analysis #30