The OBO Impact score - GitHub issues

Alongside our quality metrics #64, we want to take some notion of impact into account when providing our final OBO score #26.

There are basically two elements of raw data available to us to determine impact:

The usage data recorded as part of the metadata
How many ontologies re-use how many terms of some ontology

Neither of them is perfect.

The usage data cannot be verified with 100% certainty, and we currently do not allow recording usage in private systems, which makes it somewhat incomplete. However, usage information does go through (pull request) review, so it is not that bad. Basing the score on this could encourage people to record their usages more diligently.
How many ontologies import terms from an ontology. Let us say 25 ontologies re-use terms from PATO. On the positive side, it is a bit more objective, as it is harder to engineer (let's leave aside for now the fact that you could register 10 ontologies all of which import each other). However, some ontologies are used widely - very widely - like HPO, and will have a very bad score here as it is rarely necessary to import phenotype terms anywhere.
How many terms are being re-used how many times across all OBO ontologies. The same problems as above, but it does help account for the fact that it is better if 1000 "disease" terms are reused from one ontology than just the "disease" term itself by 100 ontologies.
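To make the difference between the two reuse counts concrete, here is a minimal sketch. The input mapping, the ontology names and the term IDs are all made up for illustration, and the source ontology is simply assumed to be the ID prefix; this is not how the Dashboard computes anything today.

```python
from collections import Counter

# Hypothetical input: for each importing ontology, the set of term IDs it
# re-uses from other ontologies. Names and IDs are invented for illustration.
reused_terms = {
    "onto_a": {"PATO:0000001", "PATO:0000002", "OBI:0000011"},
    "onto_b": {"PATO:0000001", "PATO:0000002", "PATO:0000003"},
    "onto_c": {"OBI:0000011", "OBI:0000070"},
}


def source_of(term_id: str) -> str:
    """Return the ID prefix, e.g. 'PATO' for 'PATO:0000001' (an assumption)."""
    return term_id.split(":", 1)[0]


def importing_ontology_counts(reuse: dict) -> Counter:
    """How many ontologies re-use at least one term of each source ontology."""
    counts = Counter()
    for terms in reuse.values():
        # Each importing ontology counts once per source, however many terms it uses.
        for source in {source_of(t) for t in terms}:
            counts[source] += 1
    return counts


def term_reuse_counts(reuse: dict) -> Counter:
    """How many (importing ontology, term) pairs exist for each source ontology."""
    counts = Counter()
    for terms in reuse.values():
        for term_id in terms:
            counts[source_of(term_id)] += 1
    return counts


by_ontology = importing_ontology_counts(reused_terms)
by_term = term_reuse_counts(reused_terms)
```

In this toy data both sources are imported by the same number of ontologies, but the term-level count separates them, which is the distinction the third option is after.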

I am very torn with all of this. Some people (like @cthoyt) may come and suggest GitHub stars. @cmungall will suggest his ontology API, which can crawl some key resources in the biomedical domain for the number of times a term is used in biocuration, another great metric. @bpeters42 later in this thread suggests number of citations.

None of the above is truly 100% satisfactory.

Why should "use for biocuration" (great for Uberon) be more important than "use for ontology engineering" (great for RO, OMO)?
"GitHub stars" is a metric for GitHub popularity, which is a function of "how many of my users use GitHub" and "how much is GitHub part of my workflows" -- neither of which OBO Foundry cares much about (and should not).
Citations are probably better than GitHub stars, but they require, well, a publication. Not all ontologies have one, but they are still used widely.

My personal tendency right this moment is to

abandon the impact score altogether, and simply roll with the OBO Dashboard score. But it's a weak tendency.
The next best thing is to rely on the usage data field, and apply stricter rules for its review (i.e. only count the ones with websites including TERM IDs that also resolve).
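A rough sketch of what such a stricter rule could look like. The field names `website` and `example_term_ids` are hypothetical (they are not taken from the registry metadata schema), and the rule is read here as "has a website and at least one example term ID that resolves". The resolver is passed in as a function so the check can be swapped for a real HTTP lookup.

```python
from typing import Callable, Iterable


def count_verified_usages(
    usages: Iterable[dict], resolves: Callable[[str], bool]
) -> int:
    """Count usages with a website and at least one example term ID that resolves."""
    verified = 0
    for usage in usages:
        has_site = bool(usage.get("website"))          # hypothetical field name
        term_ids = usage.get("example_term_ids", [])   # hypothetical field name
        if has_site and any(resolves(term_id) for term_id in term_ids):
            verified += 1
    return verified


# Toy data and a stub resolver; a real check would dereference the term's PURL.
known_terms = {"PATO:0000001"}
usages = [
    {"website": "https://example.org/a", "example_term_ids": ["PATO:0000001"]},
    {"website": "https://example.org/b", "example_term_ids": ["PATO:9999999"]},
    {"website": "", "example_term_ids": ["PATO:0000001"]},
    {"website": "https://example.org/c"},
]
verified_count = count_verified_usages(usages, lambda t: t in known_terms)
```

Only the first toy entry passes: the second lists a term that does not resolve, the third has no website, and the fourth has no example term IDs.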

Looking forward to your ideas!

EDIT: This thread is for discussion only, not for deciding anything. Everyone will want to promote the impact metric that will make their ontologies look the best, which is fine, but as OBO Foundry we want to decide this in a neutral way. The answer to @mellybelly's question re governance is, therefore: no decision will be reached here. Once all arguments are heard, I will compile a list of options, and then we will call a vote!

While I did include GitHub stars in the OBO Community Health report (by external suggestion, mind you), I'd be hard-pressed to say it's an actual indicator of impact. E.g., there are some trash ontologies with near 10 stars but then again, the Protein Ontology also has close to 10 stars. It's basically impossible to tease apart how popularity affects stars vs. people who want to engage with the issue tracker and leave a star while they're there vs. anything else.

I'd suggest abandoning the impact score because it will likely be difficult to create meaningful objective metrics that aren't confounded by the willingness of the ontology owners to follow some best community practices, e.g., using GitHub as their primary place of doing curation. I'd much rather focus on "usefulness" and "goodness" metrics.

Final thought: impact is a lot like obscenity/pornography - I know it when I see it.

Unless I'm under some misapprehension, the purpose of the dashboard score is to provide an evaluation of the "fitness" of an ontology as that notion pertains to adherence to the principles. In my opinion, an impact score doesn't help evaluate this (but see caveat below). So what does an impact score tell you? When the score is high, one can probably assume the ontology is 'good' and usable content-wise. But a low impact score cannot imply an ontology is 'bad'. Thus, for the aspect it is intended to evaluate, it fails.

One could argue that the impact score does indeed reflect the principle of Users. On this I would agree. However, a sliding scale for this principle is unnecessary. Put another way, does having 20 users indicate your ontology is a better adherent to this principle than one with 4? I don't think so. Number of users is more a reflection of whether or not the ontology covers a domain that has widespread need. Indeed--given the purpose of the Users principle "to ensure that the ontology tackles a relevant scientific area and does so in a usable and sustainable fashion"--adherence to it is a binary function; either the ontology has enough users to indicate that it is useful, or it doesn't.

The above, coupled with the obvious issues in evaluation, says to abandon this.

[Sorry, I edited this a bunch immediately after posting.]

I think that the context for this is "Remove foundry distinction on front page" https://github.com/OBOFoundry/OBOFoundry.github.io/issues/1140. Currently the table of OBO ontologies has groups: reviewed foundry ontologies are at the top of the list, followed by library ontologies, then inactive, orphaned, obsolete. Within each group we sort alphabetically. When the foundry distinction is dropped, we will combine foundry with active. Maybe we then split active into "domain", "project", "member". Inactive/orphaned/obsolete stay at the bottom of the list, and I guess project/member are lower on the list, but let's talk about the 100+ domain ontologies that claim some distinct scope in OBO.

So a new user comes to the obofoundry.org and looks at the table to try to decide which ontology to use. If we actually had one domain ontology for each distinct scope the decision would be easier, but we often have many. How does the user decide?

They can sort by OBO Dashboard score (#64). A good Dashboard score helps the new user choose but it doesn't capture the benefit the user gets from an ontology that is widely (re)used.

I will foolishly pick an example that I care about: OBI vs CMO (Clinical Methods Ontology). OBI has a slightly better Dashboard score (at least on the last version of the formula), but it would be easy enough for CMO to beat OBI by fixing a few things. CMO has similar scope to OBI and comes first alphabetically. OBI no longer has its foundry review status to bump it to the top of the list. As a new user, I would be likely to pick CMO.

However OBI terms are reused in 75 other OBO ontologies, while CMO is used in about 8. That's a benefit to using OBI that I would like to see reflected somehow on the obofoundry.org list.

But when we just use OBO "internal" reuse, we miss important "external" (re)use. And some ontologies get a lot of reuse via XRefs rather than PURLs -- shouldn't that count for something?

We've discussed and tested various options over the past few years, and haven't found anything that makes everyone happy. Maybe that's reason enough to abandon the attempt, but either way it seems like some projects will "win" and some will "lose".

I want to throw another metric out there for discussion: citations according to Google Scholar to the primary publication(s). That comes with all the caveats associated with citations, but I think it broadly reflects real world use and impact of ontologies better. Citations will typically be given by someone who wants to acknowledge that they really used an ontology, and not in the sense of e.g. re-using the 'organism' term from OBI. It will put ontologies like HPO and GO very high. We could limit citations to a time period like 'in the last 5 years' to avoid this being age dominated.

Most of all, when I pull citations to test rank a few ontologies, it reflects my intuition, at least in terms of orders of magnitude. Note that this is crudely picking the first paper I can find and listing total citations. And note that the vast majority of ontologies are in the

GO - 2158
RO - 1310
DO - 842
BFO - 788
HPO - 425
OBI - 229
PR - 148
PATO - 144
XAO - 60 (xenopus ontology)
ZFA - 45 (zebrafish)
CMO - 22
MRO - 13 (MHC restriction ontology, one of mine)

Note that for someone working on MHC restriction or zebrafish, a low impact rating for the ontology will not be considered a failure of that ontology - it just reflects the field's broad importance.

And we can convert to log10(1 + citations) to display meaningful differences.
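As a quick sketch of that conversion, using the rough counts listed above (the function name is just for illustration):

```python
import math

# Total citation counts as listed in this comment (first paper found, crude).
citations = {
    "GO": 2158, "RO": 1310, "DO": 842, "BFO": 788, "HPO": 425, "OBI": 229,
    "PR": 148, "PATO": 144, "XAO": 60, "ZFA": 45, "CMO": 22, "MRO": 13,
}


def log_citation_score(count: int) -> float:
    """log10(1 + citations); the +1 keeps an uncited ontology at 0 instead of -inf."""
    return math.log10(1 + count)


# Print the ontologies from most to least cited with their log-scaled score.
for name, count in sorted(citations.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:5s} {count:5d} {log_citation_score(count):.2f}")
```

The log scale compresses the gap between the most and least cited ontologies in this list from a factor of over 100 to roughly a factor of 3, so differences between smaller ontologies stay visible.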

[image: image.png]

Very curious what others think.

(In reply to James A. Overton's comment of Thu, Feb 10, 2022 at 7:39 AM.)

-- Bjoern Peters Professor La Jolla Institute for Immunology 9420 Athena Circle La Jolla, CA 92037, USA Tel: 858/752-6914 Fax: 858/752-6987 http://www.liai.org/pages/faculty-peters

I edited the first post to ensure that it's clear that this issue is really for giving everyone the opportunity to say their piece and voice their concerns. After a while, I will compile all the arguments in the issue, and call a vote!

@bpeters42 I will add your citation idea, it's probably better than stars, but not ideal because it favours older over newer ontologies. But keep these ideas coming! And also the concerns.

Any impact score will favor older ontologies. And that is not wrong - impact builds over time, something created this second cannot have impact (by definition).
I was proposing citations in the last 5 years, but didn't do that in the numbers I put together. Neither did I take into account that e.g. HPO seems to publish a new paper every year.

(In reply to Nico Matentzoglu's comment of Thu, Feb 10, 2022 at 10:11 AM.)

Ah ok, yeah, citations in the last 5 years is probably much better, sorry I missed that!

I am personally all game for whatever the community decides on. I think all these measures have something in favour of them! One thing we could do is capture all these metrics separately anyways, and not compile an OBO score from them, just use them to sort the table.
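A minimal sketch of the "keep the metrics separate and just sort by them" idea. The row layout and column names are hypothetical; the numbers are the OBI/CMO reuse and citation figures quoted earlier in this thread.

```python
# Hypothetical table rows; one column per metric, no combined score.
rows = [
    {"id": "cmo", "reused_by": 8, "citations": 22},
    {"id": "obi", "reused_by": 75, "citations": 229},
]


def sort_table(table: list, metric: str) -> list:
    """Sort descending by the chosen metric, falling back to alphabetical order on ties."""
    return sorted(table, key=lambda row: (-row.get(metric, 0), row["id"]))


by_reuse = [row["id"] for row in sort_table(rows, "reused_by")]
by_citations = [row["id"] for row in sort_table(rows, "citations")]
```

A missing metric is treated as 0 here, so an ontology without that data simply drops to the alphabetical tail rather than being excluded.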

Another idea is to include at least the metadata-verified usage count in the #64 OBO Dashboard score, to alleviate some of the bias created by small, formally correct ontologies that are not used anywhere. I could register nico.owl in OBO Foundry, make sure the metadata is all perfect, and then get a score of 100% - risky business.

OBOFoundry / OBO-Dashboard

The OBO Impact score #65