Closed (mistermboy closed this issue 3 years ago)
Some facts:
HDT is a compact data structure and binary serialization format for RDF that keeps big datasets compressed to save space while maintaining search and browse operations without prior decompression.
There is a GUI desktop tool for working with HDT, which can convert RDF files to HDT format. I converted a TTL file with 10M triples to an HDT file and it took just a few minutes.
The official HDT website already provides some big datasets, such as Wikidata and DBpedia, in HDT format.
There is an HDT library for working with HDT files from Apache Jena (hdt-jena).
It is said to be possible to deploy a SPARQL endpoint over HDT files using Jena Fuseki. However, when I tried to upload an HDT file to one of my Fuseki instances, it reported a syntax error. There is a manual for doing this which says it takes a few minutes, but it does not seem that easy, and some of the example file links are down. There is also a repo for the same purpose, but after compiling it I get this error when I try to run the server:
Because Scholia queries use nonstandard Blazegraph features, which HDT does not support, and we want to reuse those queries, we will not research HDT any further for now.
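To get a rough idea of how many queries are affected, one could scan them for Blazegraph / Wikidata Query Service extensions. The following is a minimal sketch, not a SPARQL parser: the pattern list (`wikibase:label` service, `bd:`, `hint:` and `gas:` prefixed names) is an assumption about which extensions matter and is not exhaustive, and the function name `find_nonstandard_features` is made up for illustration.

```python
import re

# Assumption: an illustrative, non-exhaustive list of Blazegraph/WDQS-specific
# constructs. Real queries may rely on other extensions not listed here.
NONSTANDARD_PATTERNS = {
    "label service": re.compile(r"SERVICE\s+wikibase:label", re.IGNORECASE),
    "bd: namespace": re.compile(r"\bbd:\w+"),
    "query hint": re.compile(r"\bhint:\w+"),
    "gas: service": re.compile(r"\bgas:\w+"),
}


def find_nonstandard_features(query):
    """Return the names of the patterns above that occur in the query text."""
    return [
        name
        for name, pattern in NONSTANDARD_PATTERNS.items()
        if pattern.search(query)
    ]


if __name__ == "__main__":
    example = """
    SELECT ?item ?itemLabel WHERE {
      ?item wdt:P31 wd:Q5 .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    """
    print(find_nonstandard_features(example))
```

A plain-text scan like this can produce false positives (for example, a match inside a comment or string literal), so it is only useful as a first filter.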
Play with HDT: try to convert some RDF files to HDT format and query them. Please add some instructions and the flow followed. If possible, also add some feedback about it.
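As background for anyone picking this up, here is a toy sketch of the general idea that lets a format like HDT stay compact and still be searchable: each distinct term is stored once in a dictionary, and triples are kept as sorted integer IDs. This is only an illustration under that assumption. It is not the real HDT binary layout or API, and the class `ToyCompactStore` is hypothetical.

```python
from bisect import bisect_left


class ToyCompactStore:
    """Toy illustration only: NOT the real HDT format or library."""

    def __init__(self, triples):
        # Dictionary component: every distinct term is stored once
        # and mapped to a small integer ID.
        terms = sorted({term for triple in triples for term in triple})
        self.id_to_term = terms
        self.term_to_id = {term: i for i, term in enumerate(terms)}
        # Triples component: ID tuples, sorted by (subject, predicate, object).
        self.encoded = sorted(
            {tuple(self.term_to_id[t] for t in triple) for triple in triples}
        )

    def search(self, s=None, p=None, o=None):
        """Yield matching triples as strings; None acts as a wildcard."""
        pattern = []
        for term in (s, p, o):
            if term is None:
                pattern.append(None)
            elif term in self.term_to_id:
                pattern.append(self.term_to_id[term])
            else:
                return  # unknown term, so nothing can match
        lo, hi = 0, len(self.encoded)
        if pattern[0] is not None:
            # Binary search on the sorted IDs narrows the scan to one subject.
            lo = bisect_left(self.encoded, (pattern[0],))
            hi = bisect_left(self.encoded, (pattern[0] + 1,))
        for ids in self.encoded[lo:hi]:
            if all(w is None or w == g for w, g in zip(pattern, ids)):
                yield tuple(self.id_to_term[i] for i in ids)


if __name__ == "__main__":
    store = ToyCompactStore([
        ("ex:alice", "ex:knows", "ex:bob"),
        ("ex:alice", "ex:knows", "ex:carol"),
        ("ex:bob", "ex:knows", "ex:carol"),
    ])
    for triple in store.search(s="ex:alice"):
        print(triple)
```

Queries run directly against the encoded IDs, and terms are only looked up in the dictionary when results are returned.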