HDT Research - GitHub Issues

mistermboy commented 3 years ago

Play with HDT: try to convert some RDF files to HDT format and query them. Please add some instructions and describe the flow you followed. If possible, also add some feedback about it.

mistermboy commented 3 years ago

Some facts:

HDT is a compact data structure and binary serialization format for RDF that keeps big datasets compressed to save space while maintaining search and browse operations without prior decompression.
There is a GUI desktop tool for working with HDT. This tool allows us to convert RDF files to HDT format. I tried converting a TTL file with 10M triples to an HDT file and it took just a few minutes.
The official HDT website already provides some big datasets, such as Wikidata and DBpedia, in HDT format.
There is an HDT library for Apache Jena (hdt-jena, from the hdt-java project) to work with it.
It's said that it is possible to deploy a SPARQL endpoint over HDT files using Jena Fuseki. However, when I tried to upload an HDT file to one of my Fuseki instances, it reported a syntax error. There is a manual for doing this which says that it can be done in a few minutes, but it doesn't seem that easy: some of the example file links are down. There is also a repo for the same purpose, but after compiling it I get this error when I try to run the server:
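
As an aside on the first fact above (compact storage that can still be searched without decompression), here is a toy Python sketch of the general idea. It is not the real HDT binary format or any HDT library API: the class name ToyHDT, its search method and the ex: terms are all made up for illustration. It only assumes the broad design of a dictionary that maps each term to an integer ID plus a sorted list of ID triples, and it answers triple patterns by comparing IDs, turning them back into strings only for the matches.

```python
class ToyHDT:
    """Toy stand-in for a dictionary + ID-triples layout (hypothetical, not real HDT)."""

    def __init__(self, triples):
        # Dictionary: every distinct term is stored once and gets a small integer ID.
        terms = sorted({term for triple in triples for term in triple})
        self.term_to_id = {term: i + 1 for i, term in enumerate(terms)}
        self.id_to_term = {i: term for term, i in self.term_to_id.items()}
        # Triples: only sorted ID tuples are kept, duplicates removed.
        self.triples = sorted(
            tuple(self.term_to_id[term] for term in triple) for triple in set(triples)
        )

    def search(self, s=None, p=None, o=None):
        """Yield triples matching the pattern; None acts as a wildcard."""
        pattern = []
        for term in (s, p, o):
            if term is None:
                pattern.append(None)
            elif term in self.term_to_id:
                pattern.append(self.term_to_id[term])
            else:
                return  # a term missing from the dictionary cannot match anything
        # Plain linear scan over IDs; a real implementation would use indexes instead.
        for ids in self.triples:
            if all(want is None or want == got for want, got in zip(pattern, ids)):
                yield tuple(self.id_to_term[i] for i in ids)


# Made-up example data.
data = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:knows", "ex:carol"),
    ("ex:bob", "ex:knows", "ex:carol"),
]
store = ToyHDT(data)
results = list(store.search(s="ex:alice", p="ex:knows"))
print(results)
```

The point of the sketch is only that long IRIs are stored once in the dictionary while the triples themselves shrink to small integers, which is where most of the space saving in this kind of layout comes from.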

mistermboy commented 3 years ago

Because Scholia queries use nonstandard Blazegraph features, which are not supported by HDT, we will not research HDT any further for now, so that the existing queries can be reused.

weso / weso-scholia

HDT Research #3