The following Python modules need to be installed:
All configuration settings should be in the config.py
file, which should be created by renaming
config.py.example.
The list of input URLs is assigned, as a Python list, to the input_urls
variable.
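Based on the settings named above, a config.py might look like the sketch below. Only input_urls and sparqlstore['dbpedia_url'] are named in this README; the URL values and overall layout are illustrative assumptions, and the real config.py.example may contain additional settings.

```python
# Sketch of a possible config.py (assumed layout; the values below are
# illustrative placeholders, not taken from the repository).

# CEUR-WS volume URLs to crawl.
input_urls = [
    "http://ceur-ws.org/Vol-1085/",
    "http://ceur-ws.org/Vol-1081/",
]

# SPARQL endpoint used for DBpedia lookups.
sparqlstore = {
    "dbpedia_url": "http://lod.openlinksw.com/sparql",
}
```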
The parser uses DBpedia to extract the names of countries and universities, together with their URIs in DBpedia.
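The kind of lookup involved can be sketched as a SPARQL query against the configured endpoint. The query shape below is an assumption for illustration; the parser's actual queries may differ.

```python
# Hedged sketch of a DBpedia lookup for country names and URIs.
# The query shape is an assumption; the actual parser may build
# its queries differently.

def build_country_query(limit=100):
    """Return a SPARQL query selecting DBpedia country URIs and English labels."""
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?country ?label WHERE {{
    ?country a dbo:Country ;
             rdfs:label ?label .
    FILTER (lang(?label) = "en")
}} LIMIT {limit}
"""
```

A query like this, sent to the endpoint set in sparqlstore['dbpedia_url'], returns the country URIs and labels the parser needs.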
There are two options: either
sparqlstore['dbpedia_url']
is set to the public endpoint http://lod.openlinksw.com/sparql,
or sparqlstore['dbpedia_url']
is set to a local SPARQL endpoint, in which case the RDF dumps dumps/dbpedia_country.xml
and dumps/dbpedia_universities.xml
should be uploaded to it. See the wiki for the steps to generate the DBpedia dumps. Once you have finished with the configuration, execute the following script:
python CeurWsParser/spider.py
The resulting dataset will be written to the rdfdb.ttl
file.
The SPARQL queries for Task 1 were created by translating the human-readable queries into SPARQL using our data model. The queries are listed in the wiki.
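As a hypothetical illustration of such a translation (the class and property names below are invented for the example; the actual queries and the project's data model are documented in the wiki):

```python
# Hypothetical example of a human-readable question and a SPARQL
# translation. The vocabulary (bibo:Workshop, rdfs:label) is an
# illustrative assumption, not the project's actual data model.

HUMAN_READABLE = "List the titles of all workshops in the dataset."

SPARQL_QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX bibo: <http://purl.org/ontology/bibo/>
SELECT ?workshop ?title WHERE {
    ?workshop a bibo:Workshop ;
              rdfs:label ?title .
}
"""
```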
Maxim Kolchin (kolchinmax@gmail.com)
Fedor Kozlov (kozlovfedor@gmail.com)