owlcs / owlapi

OWL API main repository

Collect representative sample of ontologies for benchmarking #313

Open sesuncedu opened 9 years ago

sesuncedu commented 9 years ago

There are lots of corpora, but most of them no longer seem to be at the URLs given in the papers that mention them :)

Measuring operations using samples stratified across multi-dimensional swings and roundabouts may suggest where tuning effort is needed.

@matthewhorridge and @tudorache were working on something for OWLED that was supposed to include some metrics on the use of WebProtégé. The proceedings aren't linked from the program yet, but I assume they have some metrics, plus lots of ontologies and timestamped change logs.

sesuncedu commented 9 years ago

I have put a number of ontologies from the corpora at http://web.stanford.edu/~horridge/publications/2014/iswc/atomic-decomposition/data/ into the gh-pages branch of owlapibenchmarks.

The files are compressed using XZ. XZ files can be decompressed in Java using the XZ for Java library.

Maven dependency:

<dependency>
    <groupId>org.tukaani</groupId>
    <artifactId>xz</artifactId>
    <version>1.5</version>
</dependency>

Gradle:

implementation 'org.tukaani:xz:1.5'
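For anyone who wants to unpack the files programmatically, here is a minimal sketch of decompression with the `XZInputStream` class from XZ for Java. The class name `XzDecompress`, the `decompress` helper, and the two command-line arguments are made up for illustration and are not part of owlapibenchmarks.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.tukaani.xz.XZInputStream;

// Hypothetical helper class; only XZInputStream comes from XZ for Java.
public class XzDecompress {

    /**
     * Reads XZ-compressed bytes from 'compressed' and writes the
     * decompressed bytes to 'out'. Returns the number of bytes written.
     */
    static long decompress(InputStream compressed, OutputStream out) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        // XZInputStream decodes the .xz container as the data is read.
        try (XZInputStream xz = new XZInputStream(compressed)) {
            int n;
            while ((n = xz.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }

    // Usage (paths are placeholders): java XzDecompress input.xz output
    public static void main(String[] args) throws IOException {
        Path source = Paths.get(args[0]);
        Path target = Paths.get(args[1]);
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = Files.newOutputStream(target)) {
            long bytes = decompress(in, out);
            System.out.println("Wrote " + bytes + " bytes to " + target);
        }
    }
}
```

Since `XZInputStream` is an ordinary `InputStream`, it should also be possible to hand it directly to `OWLOntologyManager.loadOntologyFromOntologyDocument(InputStream)` rather than writing a decompressed copy to disk first, though I have not benchmarked that route.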