digitalbazaar / jsonld.js

A JSON-LD Processor and API implementation in JavaScript
https://json-ld.org/

fromRDF is very slow (compared to alternative parsing methods) #320

Open rmeissn opened 5 years ago

rmeissn commented 5 years ago

I noticed that the fromRDF method is very slow. I requested about 50,000 triples from a store in N-Quads format and used fromRDF to convert them to JSON, in order to compact the resulting JSON with a custom context. fromRDF took about 80 seconds on my machine (Node.js v10.16), whereas rapper parsed the same N-Quads to JSON (of course, this is a different JSON) in about 180 ms. I haven't profiled fromRDF to find the cause, but I have seen that CPU usage stays at 100% the whole time.

I eventually solved this for my use case by having the triple store return JSON-LD directly instead of N-Quads (this adds about 500 ms to the request compared to N-Quads). I changed stores recently and hadn't noticed that the new one supports JSON-LD out of the box.

Nevertheless, it seems odd for the fromRDF method to have such a CPU footprint.

Use this sample program with some data of your choice to test the behaviour:

'use strict';

const jsonld = require('jsonld');
const fs = require('fs');

// Converts an N-Quads string to JSON-LD, compacts it, and logs how long each step took.
async function ntriplesToJSONLD (nquads) {
    const time1 = Date.now();
    // 'application/n-quads' is the default format for string input; stated here for clarity
    const toCompact = await jsonld.fromRDF(nquads, {format: 'application/n-quads'});
    const time2 = Date.now();
    const compacted = await jsonld.compact(toCompact, {/*some context fitting the data*/}, {'processingMode': 'json-ld-1.1'});
    const time3 = Date.now();
    console.log('timings in milliseconds');
    console.log('fromRDF: ' + (time2 - time1), 'compact: ' + (time3 - time2));
    return compacted;
}

// TODO load some n-quads file from the filesystem by using fs
const filepath = 'nquads file';
const fileContent = fs.readFileSync(filepath, {encoding: 'utf8'});

(async function() {
    await ntriplesToJSONLD(fileContent);
})();

// TODO in comparison to: time rapper -i nquads -o json test.nq > test.json
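
If no suitable dataset is at hand, input of a comparable size can be generated synthetically. The following is only a minimal sketch: the helper name makeSyntheticNQuads and the example.org IRIs are hypothetical and not part of jsonld.js, and synthetic data may not show the same timings as the real data described above.

```javascript
'use strict';

// Hypothetical helper (not part of jsonld.js): builds `count` N-Quads
// statements spread over `subjects` distinct subject IRIs.
function makeSyntheticNQuads (count, subjects = 1000) {
    const lines = [];
    for (let i = 0; i < count; i++) {
        const s = `<http://example.org/s${i % subjects}>`;
        const p = `<http://example.org/p${i % 7}>`;
        const o = `"value ${i}"`;
        // Statements without a graph label are valid N-Quads (default graph).
        lines.push(`${s} ${p} ${o} .`);
    }
    return lines.join('\n') + '\n';
}

// Roughly the size mentioned in the report above.
const sample = makeSyntheticNQuads(50000);
console.log('generated ' + sample.length + ' characters of N-Quads');
```

The resulting string can be passed to ntriplesToJSONLD in place of fileContent, or written to test.nq for the rapper comparison.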

dlongley commented 5 years ago

Yes, this definitely sounds excessive and we should look into it.