cmungall / obo-foundry-operations-committee

Automatically exported from code.google.com/p/obo-foundry-operations-committee

Automated metrics for resources listed on the obofoundry website #5

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Extract any relevant metrics that can be automated into a candidate list of 
automatable metrics as a pre-step for our review process. 

BioPortal shows: #classes, #individuals, #properties, max depth, max siblings, 
classes with a single subclass, classes with more than 25 subclasses, and 
classes with no definition. (As far as I can tell, the actual numbers they 
report are a bit broken, but the intention is there. For example, they include 
MIREOTed and upper-level entities from other namespaces in their “count” for 
the ontology, which I think is wrong.)
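As a rough illustration of the counts listed above, here is a minimal Python sketch over a made-up toy ontology. The data layout (class IRI mapped to a parent and a definition), the `EX_` ID space, and the function name are all assumptions for illustration; this is not BioPortal's or OWLTools' code. "Max siblings" is taken here as the largest number of direct subclasses under one parent, and classes from other namespaces are left out of every count.

```python
from collections import defaultdict

OBO = "http://purl.obolibrary.org/obo/"

# Hypothetical toy ontology: class IRI -> (asserted parent IRI or None, definition or None).
TOY = {
    OBO + "EX_0000001": (None, "Root of the example ontology."),
    OBO + "EX_0000002": (OBO + "EX_0000001", "A child class."),
    OBO + "EX_0000003": (OBO + "EX_0000002", None),
    OBO + "BFO_0000001": (None, "entity"),  # upper-level class from another namespace
}


def structural_metrics(classes, own_prefix, many=25):
    """Compute BioPortal-style counts, restricted to classes whose IRI starts with own_prefix."""
    own = {iri: v for iri, v in classes.items() if iri.startswith(own_prefix)}

    # Direct subclasses of each parent, counting only parents inside the ontology.
    children = defaultdict(list)
    for iri, (parent, _) in own.items():
        if parent in own:
            children[parent].append(iri)

    def depth(iri):
        # A root has depth 1; assumes the hierarchy has no cycles.
        parent = own[iri][0]
        return 1 if parent not in own else 1 + depth(parent)

    return {
        "classes": len(own),
        "max_depth": max((depth(i) for i in own), default=0),
        "max_siblings": max((len(c) for c in children.values()), default=0),
        "single_subclass": sum(1 for c in children.values() if len(c) == 1),
        "many_subclasses": sum(1 for c in children.values() if len(c) > many),
        "no_definition": sum(1 for _, d in own.values() if not d),
    }


metrics = structural_metrics(TOY, OBO + "EX_")
print(metrics)
```

The BFO class in the toy data is ignored because its IRI does not start with the ontology's own prefix, which is the behaviour argued for above.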

Of the OBO Foundry criteria, we could write a script to automatically check 
many of them: 
1) open; 2) format; 3) URIs; 5) uniqueness of content, as assessed by label 
clashes with other OBO ontologies; 6) text definitions; 7) *use* of RO 
relations or other shared relations; 9) adoption in other ontologies via 
cross-products and citations of publications; 14) use of BFO; 
15) asserted single inheritance; 18) orthogonality (same as criterion 5).
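Two of the checks in this list (label clashes for criteria 5 and 18, and asserted single inheritance for criterion 15) could be sketched roughly as follows. The input shapes and function names are assumptions made for illustration; a real script would first have to load labels and asserted superclasses from the ontology files.

```python
from collections import defaultdict


def label_clashes(ontologies):
    """Return {normalized label: sorted ontology ids} for labels used by more than one ontology.

    `ontologies` maps an ontology id to an iterable of class labels (hypothetical input shape).
    """
    seen = defaultdict(set)
    for ont_id, labels in ontologies.items():
        for label in labels:
            # Normalize case and whitespace so "Cell " and "cell" count as the same label.
            seen[" ".join(label.lower().split())].add(ont_id)
    return {label: sorted(ids) for label, ids in seen.items() if len(ids) > 1}


def multiple_inheritance(asserted_parents):
    """Return the classes that have more than one distinct asserted superclass."""
    return sorted(c for c, parents in asserted_parents.items() if len(set(parents)) > 1)


print(label_clashes({"aa": ["Cell", "nucleus"], "bb": ["cell ", "membrane"]}))
print(multiple_inheritance({"x": ["a", "b"], "y": ["a"]}))
```

Exact label matching after normalization is only a first approximation of a clash; synonyms and near-duplicates would need a separate pass.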

Action item for tech group with Janna: write code for the metrics and see if 
BioPortal wants to use them.
Calculate these metrics for all candidate ontologies on the OBO Foundry page, 
as a proof of concept.

Original issue reported on code.google.com by mcour...@gmail.com on 10 Oct 2012 at 5:52

GoogleCodeExporter commented 9 years ago
Note that OWLTools contains code for calculating basic statistics. As part of 
the OORT release process a ".metadata" file is generated for every derived 
ontology. This could easily be extended for other tags.

Currently an ad-hoc format is used. I would much rather that this produced 
metadata as triples. We need to standardize the vocabulary for this. Should 
this go in ontology-metadata.owl? I would rather the annotation properties were 
human-readable, but I am open to discussion.
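To make "metadata as triples" concrete, here is a minimal sketch that writes a dictionary of statistics as N-Triples. The `http://example.org/ontology-stats#` namespace and the property names are placeholders, since the real vocabulary is exactly what still needs to be standardized, and the numbers are made up.

```python
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

# Hypothetical namespace; not a real or agreed vocabulary.
STATS_NS = "http://example.org/ontology-stats#"


def stats_to_ntriples(ontology_iri, stats):
    """Serialize {property name: integer} statistics as N-Triples lines about ontology_iri."""
    lines = []
    for name, value in sorted(stats.items()):
        lines.append(
            f'<{ontology_iri}> <{STATS_NS}{name}> "{int(value)}"^^<{XSD_INTEGER}> .'
        )
    return "\n".join(lines)


print(stats_to_ntriples(
    "http://purl.obolibrary.org/obo/uberon.owl",
    {"class_count": 3, "object_property_count": 1},  # made-up numbers
))
```

Human-readable local names such as `class_count` are used here only to show the preference stated above; numeric IDs would serialize the same way.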

Original comment by cmung...@gmail.com on 17 Oct 2012 at 5:15

GoogleCodeExporter commented 9 years ago
I agree with using the Oort code for that, and I totally agree that we should 
translate the current Oort format into triples. The IAO ontology metadata is 
probably a good place to host the properties we need. We can discuss it on 
today's call, but one task could be to define the properties needed.

Original comment by carlotor...@gmail.com on 17 Oct 2012 at 5:19

GoogleCodeExporter commented 9 years ago
Do you have a pointer to such a .metadata file and/or the list of statistics 
currently being compiled?

Original comment by mcour...@gmail.com on 17 Oct 2012 at 5:25

GoogleCodeExporter commented 9 years ago
There's lots of Oort output here:
http://build.berkeleybop.org/

E.g.
http://build.berkeleybop.org/job/build-uberon/lastSuccessfulBuild/artifact/main/uberon-metadata.txt

Apologies for the weird shouty-caps, Java-properties-style tags.

The output includes a table showing property vs axiom type. This is probably 
more than we'd want to do in triples, at least at first. I'm imagining 
first standardizing the basic statistics and clearly documenting what "class 
count" means, possibly with subproperties for the different senses (with/without 
deprecated, with/without MIREOTed). The tables can live separately in the interim.
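The different senses of "class count" could be pinned down along these lines. This is only a sketch: the input shape (a list of IRI and deprecated-flag pairs), the key names, and the rule that "MIREOTed" means "IRI outside the ontology's own prefix" are assumptions for illustration.

```python
def class_counts(classes, own_prefix):
    """Count classes under each sense of "class count".

    `classes` is a list of (iri, deprecated) pairs; a class is treated as
    MIREOTed when its IRI does not start with own_prefix (an assumption).
    """
    native = [(iri, dep) for iri, dep in classes if iri.startswith(own_prefix)]
    return {
        "all": len(classes),
        "excluding_deprecated": sum(1 for _, dep in classes if not dep),
        "excluding_mireoted": len(native),
        "excluding_deprecated_and_mireoted": sum(1 for _, dep in native if not dep),
    }


OBO = "http://purl.obolibrary.org/obo/"
example = [
    (OBO + "UBERON_0000001", False),
    (OBO + "UBERON_0000002", True),   # deprecated
    (OBO + "GO_0005634", False),      # imported from another ontology
]
print(class_counts(example, OBO + "UBERON_"))
```

Each key would correspond to one of the subproperties suggested above.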

Original comment by cmung...@gmail.com on 17 Oct 2012 at 5:34

GoogleCodeExporter commented 9 years ago
One crucial stat we need to add is the number of classes / properties MIREOTed 
from other OBO ontologies.
This is one of the primary stats of interest for a first evaluation of 
compliance with the OBO principles.
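A minimal sketch of that stat, assuming the source ontology can be read off the ID space of an OBO PURL (for example `GO` in `.../obo/GO_0005634`). The function names are hypothetical, and IRIs that do not follow the PURL pattern are simply skipped.

```python
from collections import Counter

OBO_PURL = "http://purl.obolibrary.org/obo/"


def id_space(iri):
    """Return the ID space of an OBO PURL (e.g. 'GO' for .../obo/GO_0005634), else None."""
    if not iri.startswith(OBO_PURL):
        return None
    local = iri[len(OBO_PURL):]
    prefix, sep, _ = local.rpartition("_")
    return prefix if sep else None


def mireot_counts(iris, own_id_space):
    """Count entities per foreign ID space, ignoring the ontology's own ID space."""
    spaces = (id_space(iri) for iri in iris)
    return Counter(s for s in spaces if s is not None and s != own_id_space)


print(mireot_counts(
    [
        OBO_PURL + "UBERON_0000001",
        OBO_PURL + "GO_0005634",
        OBO_PURL + "GO_0005575",
        OBO_PURL + "BFO_0000050",
        "http://example.org/not-an-obo-purl",
    ],
    "UBERON",
))
```

The same function works for classes and properties, since both use the same PURL pattern.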

Original comment by carlotor...@gmail.com on 17 Oct 2012 at 5:37

GoogleCodeExporter commented 9 years ago
Another option is to have a SPARQL query compute the stats. More declarative 
than Java. The SPARQL could be an annotation property (AP) in 
ontology-metadata.owl. OntoBee could just run this.
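For instance, a class count restricted to one ID space might look like the query built below. This is a sketch only: the helper and template names are made up, the query text is what would be stored in the annotation property, and it has not been run against OntoBee or any other endpoint.

```python
# SPARQL 1.1 template; doubled braces are literal braces for str.format.
CLASS_COUNT_TEMPLATE = """\
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT (COUNT(DISTINCT ?c) AS ?classCount)
WHERE {{
  ?c a owl:Class .
  FILTER(isIRI(?c) && STRSTARTS(STR(?c), "{prefix}"))
}}
"""


def class_count_query(iri_prefix):
    """Return a SPARQL query counting named classes whose IRI starts with iri_prefix."""
    return CLASS_COUNT_TEMPLATE.format(prefix=iri_prefix)


print(class_count_query("http://purl.obolibrary.org/obo/UBERON_"))
```

The `isIRI` filter drops anonymous class expressions, and the prefix filter leaves out MIREOTed classes, so this corresponds to one specific sense of "class count".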

Original comment by cmung...@gmail.com on 17 Oct 2012 at 6:00

GoogleCodeExporter commented 9 years ago
I actually like this idea. 

Original comment by carlotor...@gmail.com on 17 Oct 2012 at 6:05