GeoKnow / LinkedGeoData

OpenStreetMap for the Semantic Web
http://linkedgeodata.org
GNU General Public License v3.0

public LinkedGeoData.org website cannot be recreated via existing docs #35

Open TallTed opened 4 years ago

TallTed commented 4 years ago

Documentation is entirely focused on building a PostgreSQL repository, but the public endpoint is hosted in a build (4+ years old, as of this writing) of VOS, Virtuoso Open Source Edition.

LinkedGeoData documentation should enable a user to recreate the public website in their own environment and infrastructure, i.e., in a Virtuoso instance.

Aklakan commented 3 years ago

We are working on setting up a new public endpoint. However, our hardware resources are currently insufficient for hosting the whole planet (the 'old' setup ran out of disk space after using up ~3TB). The plan is to try to reuse the current hardware to host a Europe chapter of LGD based on the pre-published chunks from http://download.geofabrik.de/. Over the past few weeks we collaborated with people from Ontop, which led to the contribution of the lgd-ontop-web container in the develop branch. After all, LGD is based on RDB2RDF mappings over a PostgreSQL database.

The docker setup already allows for creating one's own setup of the data services. As for the web page itself, at some point it should be recreated with a static site generator. At present it is an archaic wacko wiki instance, and I wouldn't want to invest time in porting it to docker. So don't expect this part soon.

I still need to finish polishing the docker setup, but if all works out as it should, we should see a revival of LGD soon :)

What then needs to be done is a revision of the ~8 year old mappings. In principle, I think LGD's mappings should become aligned with those used in Sophox (which is unfortunately down at the time of writing).

TallTed commented 3 years ago

It sounds like you didn't reach out to @OpenLink about this?

It seems worth a reminder that Virtuoso is a hybrid, multi-model DBMS, and can be loaded with SQL data which can be exposed as RDF via internally applied RDB2RDF mappings. This RDF exposure can be kept virtual or physically replicated and stored as RDF data within Virtuoso's own RDF quad store.

With Virtuoso Enterprise Edition, RDB2RDF mappings can also be applied to external SQL data, in PostgreSQL or any ODBC- or JDBC-accessible DBMS. (JDBC-accessible external data also requires use of an ODBC-to-JDBC Bridge Driver.) Again, the RDF exposure can be kept entirely virtual or be replicated into Virtuoso's RDF quad store.
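
To make the virtual vs. replicated distinction concrete, here is a minimal Python sketch. It is only an illustration of the general idea, not how Virtuoso, Sparqlify or Ontop implement it: SQLite stands in for the SQL store, and the `nodes` table, the `http://example.org/node/` namespace and all function names are hypothetical.

```python
import sqlite3

# Hypothetical namespace and table layout, chosen only for illustration.
BASE = "http://example.org/node/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def open_demo_db():
    """Create an in-memory SQL table standing in for the relational source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?)",
        [(1, "Leipzig"), (2, "Monaco")],
    )
    return conn


def virtual_triples(conn):
    """Virtual exposure: triples are computed from the rows on every call."""
    rows = conn.execute("SELECT id, name FROM nodes ORDER BY id")
    for node_id, name in rows:
        yield (f"{BASE}{node_id}", RDFS_LABEL, name)


def materialize(conn):
    """Replicated exposure: copy the mapped triples into a separate store."""
    return list(virtual_triples(conn))


conn = open_demo_db()
store = materialize(conn)

# A later change to the SQL data shows up in the virtual view,
# while the replicated copy stays as it was until it is refreshed.
conn.execute("UPDATE nodes SET name = 'Leipzig (DE)' WHERE id = 1")
```

The trade-off is the usual one: the virtual view always reflects the current SQL data but pays the mapping cost per query, while the replicated copy is fast to read but has to be refreshed when the source changes.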

Aklakan commented 3 years ago

Yes, I have not reached out to OpenLink yet, because the priority is to get this setup running again with the existing and recently contributed open source tooling (ontop), so that anyone can easily replicate this work (including myself, for deploying on our server). If you look at the architecture on the develop branch, there is in theory no problem in simply adding another RDB2RDF mapper next to sparqlify and ontop; whether it is that easy in practice is a different matter. If @openlink wants to showcase their commercial store based on the resources at a later stage, I don't have any objections, and we can discuss once things are running again.

Aklakan commented 3 years ago

After ~10 years the website is finally ported to gh-pages so that everyone can just fork it ;) It still contains some outdated information, but besides making it easier for me to fix things just via git, this move now also allows for contributing improvements via PRs. There is also a Monaco test bed set up with sparqlify, ontop and nominatim working together, but I am still encountering some issues with Europe-wide data: the OSM data itself loaded successfully, but I am facing an issue when loading with nominatim, and ontop currently complains about an inconsistent ontology, so I am investigating this. Once these things are fixed, I'll look into whether the old dump scripts still work.