comunica / comunica-feature-link-traversal

📬 Comunica packages for link traversal-based query execution

Naive link traversal for TREE RDF #77

Closed constraintAutomaton closed 1 year ago

constraintAutomaton commented 2 years ago

Introduction

This PR adds naive link traversal for TREE. The implementation simply follows every available tree:relation.

Implementation

An ActorExtractLinksTree actor has been created. Its test method accepts any action. Its run method follows every tree:relation whose subject is the current page.
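The behaviour described above can be sketched without any Comunica dependency. This is a minimal illustration, not the code of ActorExtractLinksTree: the function name extractTreeLinks, the plain { subject, predicate, object } quad objects, and the example.org URLs are assumptions made for the example, and the real actor works on an RDF quad stream. Only the tree:relation and tree:node IRIs come from the TREE vocabulary.

```javascript
const TREE = 'https://w3id.org/tree#';

// Hypothetical helper: returns the URLs reachable through tree:relation from pageUrl.
function extractTreeLinks(pageUrl, quads) {
  const relationIds = new Set();    // relation nodes attached to the current page
  const nodeByRelation = new Map(); // relation node -> tree:node target

  // Quads may arrive in any order, so both lookups are filled in a single pass.
  for (const { subject, predicate, object } of quads) {
    if (predicate === `${TREE}relation` && subject === pageUrl) {
      relationIds.add(object);
    } else if (predicate === `${TREE}node`) {
      nodeByRelation.set(subject, object);
    }
  }

  // Keep only the targets of relations that belong to the current page.
  const links = [];
  for (const id of relationIds) {
    if (nodeByRelation.has(id)) {
      links.push(nodeByRelation.get(id));
    }
  }
  return links;
}

// Made-up data: one relation on the current page, one on another page.
const page = 'https://example.org/page1';
const quads = [
  { subject: '_:r1', predicate: `${TREE}node`, object: 'https://example.org/page2' },
  { subject: page, predicate: `${TREE}relation`, object: '_:r1' },
  { subject: 'https://example.org/other', predicate: `${TREE}relation`, object: '_:r2' },
  { subject: '_:r2', predicate: `${TREE}node`, object: 'https://example.org/page3' },
];

console.log(extractTreeLinks(page, quads)); // [ 'https://example.org/page2' ]
```

The relation attached to https://example.org/other is ignored because its subject is not the current page.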

Limitation

Some TREE and LDES sources appear to use inconsistent URLs in their relations. For example, a URL may trigger an HTTP redirect while the returned document still refers to the original link.
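The mismatch can be reproduced with a small sketch. It assumes that relations are selected by comparing the subject of tree:relation with the URL of the document. The URLs and the helper relationsOfPage are hypothetical and only illustrate why a redirect can hide relations.

```javascript
const TREE_RELATION = 'https://w3id.org/tree#relation';

// Hypothetical helper: relation nodes whose subject is exactly the given page URL.
function relationsOfPage(pageUrl, quads) {
  return quads
    .filter((q) => q.predicate === TREE_RELATION && q.subject === pageUrl)
    .map((q) => q.object);
}

const requestedUrl = 'https://example.org/ldes';        // URL that was dereferenced
const finalUrl = 'https://example.org/ldes/page-1.ttl'; // URL after the HTTP redirect

// The document still names the original URL as the subject of its relation.
const redirectedQuads = [
  { subject: requestedUrl, predicate: TREE_RELATION, object: '_:r1' },
];

console.log(relationsOfPage(finalUrl, redirectedQuads));     // []
console.log(relationsOfPage(requestedUrl, redirectedQuads)); // [ '_:r1' ]
```

Whether a relation is found depends on which of the two URLs is treated as the current page.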

The test method always returns true, so the actor accepts every action and must be chosen carefully in the configuration.

There is no stopping condition when reading relations, so the whole document is always processed. A document with many tree:member triples is therefore slow to read relative to the number of relations it may contain.
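To make the cost concrete, the sketch below counts how many quads are inspected before the relations of a page are known. It assumes a hypothetical document that lists 10,000 tree:member triples before its single tree:relation. The names documentQuads and countInspectedQuads and the example.org IRIs are illustrative only.

```javascript
const TREE_NS = 'https://w3id.org/tree#';

// Hypothetical document: many members first, one relation at the very end.
function* documentQuads(memberCount) {
  for (let i = 0; i < memberCount; i++) {
    yield {
      subject: 'https://example.org/collection',
      predicate: `${TREE_NS}member`,
      object: `https://example.org/member/${i}`,
    };
  }
  yield {
    subject: 'https://example.org/page1',
    predicate: `${TREE_NS}relation`,
    object: '_:r1',
  };
}

// Without a stopping condition every quad is read, whether or not it is a relation.
function countInspectedQuads(quads) {
  let inspected = 0;
  let relations = 0;
  for (const quad of quads) {
    inspected++;
    if (quad.predicate === `${TREE_NS}relation`) {
      relations++;
    }
  }
  return { inspected, relations };
}

console.log(countInspectedQuads(documentQuads(10000)));
// { inspected: 10001, relations: 1 }
```

All 10,001 quads are read to find one relation, because a relation may appear anywhere in the document.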

Testing method

A unit test has been implemented.

As there are currently no system tests, manual testing has been done.

Engine configuration

The configuration file engines/query-sparql-link-traversal/config/config-default.json currently contains:

{
  "@context": [
    "https://linkedsoftwaredependencies.org/bundles/npm/@comunica/config-query-sparql-link-traversal/^0.0.0/components/context.jsonld"
  ],
  "import": [
    "ccqslt:config/config-base.json",
    "ccqslt:config/extract-links/actors/tree.json"
  ]
}

Test Script

An index.js file is created in the root of the repo with the following content:

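const communica = require("@comunica/query-sparql-link-traversal");
const log = require("@comunica/logger-pretty");

// Build an engine from the configuration file that enables the TREE actor.
new communica.QueryEngineFactory().create({ configPath: './engines/query-sparql-link-traversal/config/config-default.json' }).then(
  (engine) => {
    engine.queryBindings(`
  SELECT ?s WHERE {
    ?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://w3id.org/tree#node>.
  }`, {
      sources: ['https://treecg.github.io/demo_data/vtmk/f.ttl'],
      // Trace-level logging prints the debugging info used for validation.
      log: new log.LoggerPretty({ level: 'trace' }),
    }).then((bindingsStream) => {
      // Print each result as it arrives.
      bindingsStream.on('data', (binding) => {
        console.log(binding.toString());
      });
      // Report failures raised while the query is running.
      bindingsStream.on('error', (error) => {
        console.error(error);
      });
    });
  }
).catch((error) => {
  // Report failures raised while the engine is being created.
  console.error(error);
});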

Validation

Select a source from the website https://treecg.github.io/TREE-LDES-visualizer/. Run node index.js in the root of the repo. Check that the debugging info printed by the logger matches the "Selected resource" information provided by the TREE Validator.