AKSW / OntoWiki.UseCases

Collection of Use Cases, User Stories and Requirements for OntoWiki

LinkedSpending: Large amounts of data and complex structure #2

Open KonradHoeffner opened 3 years ago

KonradHoeffner commented 3 years ago

After being extremely impressed by using OntoWiki for Geoknow, we set up OntoWiki for the LinkedSpending project. However, because LinkedSpending has different properties (see the LinkedSpending paper), there were several problems. Also, I think there were some forbidden paths like /resource, so if you wanted to put your instance under, for example, http://linkedspending.aksw.org/resource/example, it would break. But that was many years ago, so it may not be like this anymore.

Large amounts of data

RDF data cubes: 627
triples: 113 million
uncompressed N-Triples dump: 24.5 GB
observations (cube cells): 5 million

OntoWiki didn't seem to be optimized for such large amounts of data, even with some modifications made by Michael Martin to simplify some OntoWiki index queries. Some of the SPARQL queries generated by OntoWiki would overload the SPARQL endpoint and cause a drastic slowdown or even a crash of the website. This could also be due to the resource restrictions of the virtual server, which has 6 GB of RAM and 3 CPU cores (it may have been different at the time).

CubeViz plugin

While the CubeViz plugin was one of the few possibilities, or as far as I know even the only one, at that time to visualize RDF data cubes in an editor, it also seemed to be hindered by the large amount of data. Also, the sparseness of the data cubes made it very hard to get meaningful visualizations. The paper contains one, but it took a large amount of trial and error to set the facets to the correct values that result in a meaningful diagram.
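To illustrate why sparseness hurts here, the following is a minimal Python sketch, not CubeViz code: it measures what fraction of the possible dimension combinations in a cube actually hold an observation. The function name `cube_density`, the dimension names and the toy observations are all hypothetical and not taken from LinkedSpending.

```python
def cube_density(observations, dimensions):
    """Return the fraction of possible cells that hold an observation."""
    # Distinct values that occur for each dimension.
    values = {d: {obs[d] for obs in observations} for d in dimensions}

    # Number of cells in the full cross product of the dimension values.
    possible = 1
    for d in dimensions:
        possible *= len(values[d])

    # Cells that are actually filled (duplicates count once).
    filled = {tuple(obs[d] for d in dimensions) for obs in observations}
    return len(filled) / possible if possible else 0.0


# Hypothetical toy cube: three observations, no dimension value shared.
observations = [
    {"year": 2010, "region": "A", "sector": "health"},
    {"year": 2011, "region": "B", "sector": "education"},
    {"year": 2012, "region": "C", "sector": "defence"},
]

density = cube_density(observations, ["year", "region", "sector"])
print(f"{density:.3f}")
```

In this toy cube only 3 of 27 possible cells are filled, so most facet selections would produce an empty or single-point diagram, which matches the trial and error described above.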

Complex (meta-)structure of the RDF Data Cube Vocabulary

RDF Data Cubes are modelled using the RDF Data Cube Vocabulary, and RDF is only a lower layer here. However, OntoWiki works on the RDF level, so of course it cannot verify that your data cubes are correctly modelled or prevent you from breaking them. This is a general problem with OntoWiki and all other RDF editors; I don't know if there is even something you can do here. Theoretically, OntoWiki could read in some ontology, meta-ontology, SHACL shapes or a similar mechanism, and verify and protect your data with that in mind.
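As a rough idea of what such a check could look like, here is a minimal sketch in plain Python, standing in for a SHACL shape rather than using one. It tests the first integrity constraint of the RDF Data Cube Vocabulary (every qb:Observation has exactly one qb:dataSet) over an in-memory list of triples. The function name and the example.org URIs are hypothetical, and nothing here is an existing OntoWiki feature.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
QB = "http://purl.org/linked-data/cube#"


def observations_violating_ic1(triples):
    """Return observations that do not have exactly one qb:dataSet."""
    # All subjects typed as qb:Observation.
    observations = {
        s for s, p, o in triples
        if p == RDF_TYPE and o == QB + "Observation"
    }

    # Data sets attached to each subject via qb:dataSet.
    datasets = {}
    for s, p, o in triples:
        if p == QB + "dataSet":
            datasets.setdefault(s, set()).add(o)

    return sorted(
        obs for obs in observations
        if len(datasets.get(obs, ())) != 1
    )


# Hypothetical toy data: obs1 is fine, obs2 lost its qb:dataSet link.
EX = "http://example.org/"
triples = [
    (EX + "obs1", RDF_TYPE, QB + "Observation"),
    (EX + "obs1", QB + "dataSet", EX + "cube1"),
    (EX + "obs2", RDF_TYPE, QB + "Observation"),
]

violations = observations_violating_ic1(triples)
print(violations)
```

An editor could run a check like this before saving a change and refuse or warn when the list is not empty. On a store with 113 million triples it would have to run as a query against the endpoint instead of in memory.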

Break on Ubuntu 20.04 LTS Upgrade

An upgrade of the server from Ubuntu 16.04 to 18.04 went without problems, but when going to 20.04 after that, something broke. I don't remember what we did exactly (maybe something with the PHP version?), but I think Nathanael helped me to fix it after that. Docker is probably the solution to that, but I wasn't comfortable with it, so we fixed it in a direct installation.