rubensworks / jsonld-streaming-parser.js

A fast and lightweight streaming JSON-LD parser for JavaScript
https://www.rubensworks.net/blog/2019/03/13/streaming-rdf-parsers/
MIT License

How to parse structured JSON-LD files #26

Closed templth closed 5 years ago

templth commented 5 years ago

Hello,

I'm trying to parse structured JSON-LD files with the library.

The file is structured JSON-LD, so all the elements I'm interested in are located at the root of the @graph array:

    {
      "@context": {
        "@vocab": "https://www.datatourisme.gouv.fr/ontology/core#",
        "schema": "http://schema.org/",
        "bd": "http://www.bigdata.com/rdf#",
        (...)
      },
      "@graph": [{
        "@id": "https://data.datatourisme.gouv.fr/3/06a7f439-3e02-3aa2-8301-f850bb5b792f",
        "dc:date": [{
          "@value": "2013-10-30",
          "@type": "xsd:date"
        },{
          "@value": "2019-08-30",
          "@type": "xsd:date"
        }],
        "dc:identifier": "eudonet:52945",
        "@type": ["schema:Landform","NaturalHeritage","PlaceOfInterest","PointOfInterest","urn:resource"],
        "rdfs:label": {
          "@value": "L'arbre du Pied Cornier",
          "@language": "fr"
        },
        (...)
      }]
    }

I can parse the file using the following code:

    const fs = require('fs'); // needed for fs.createReadStream below
    const JsonLdParser = require('jsonld-streaming-parser').JsonLdParser;

    const parser = new JsonLdParser();

    const getStream = () => {
      const jsonData = 'flux-5339-201909240851.partial.jsonld';
      const stream = fs.createReadStream(jsonData, {encoding: 'utf8'});
      return stream.pipe(parser);
    };

    getStream()
      .on('data', (data) => {
        console.log('data = ', data);
      })
      .on('error', (error) => {
        console.error(error);
      })
      .on('end', () => {
        console.log('All triples were parsed!');
      });

I expect to have the comprehensive content for an element within the data callback but got this:

    {
      "subject": {
        "value": "https://data.datatourisme.gouv.fr/3/06a7f439-3e02-3aa2-8301-f850bb5b792f"
      },
      "predicate": {
        "value": "http://purl.org/dc/elements/1.1/date"
      },
      "object": {
        "value": "2013-10-30",
        "datatype": {
          "value": "http://www.w3.org/2001/XMLSchema#date"
        },
        "language": ""
      },
      "graph": {
        "value": ""
      }
    }

Thanks for your help! Thierry

rubensworks commented 5 years ago

> I expect to have the comprehensive content for an element within the data callback but got this

This library parses JSON-LD to RDF triples/quads, so this does not directly allow you to do what you want.

I'm not sure what you envision as the comprehensive content for an element, but another RDF(JS) library may exist for this.
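As a rough illustration of what the parser emits, here is a minimal sketch that regroups quads into one plain object per subject. `createSubjectCollector` is a hypothetical helper, not part of jsonld-streaming-parser. It assumes only that each quad has the `subject.value`, `predicate.value`, and `object.value` fields visible in the output above. It keeps literal values only, so datatypes, languages, and nested nodes are dropped.

```javascript
// Hypothetical helper (not part of jsonld-streaming-parser):
// collects RDF/JS-shaped quads into one plain object per subject.
function createSubjectCollector() {
  const bySubject = new Map();
  return {
    // Call once per quad, e.g. from the parser's 'data' handler.
    add(quad) {
      const id = quad.subject.value;
      if (!bySubject.has(id)) {
        bySubject.set(id, { '@id': id });
      }
      const entry = bySubject.get(id);
      const key = quad.predicate.value;
      // Store values in arrays: a predicate such as dc:date may repeat.
      if (!entry[key]) {
        entry[key] = [];
      }
      entry[key].push(quad.object.value);
    },
    // Call after the 'end' event to obtain the grouped elements.
    result() {
      return Array.from(bySubject.values());
    },
  };
}

// Stand-in quads shaped like the output above; only `.value` is used.
const collector = createSubjectCollector();
const subject = {
  value: 'https://data.datatourisme.gouv.fr/3/06a7f439-3e02-3aa2-8301-f850bb5b792f',
};
const predicate = { value: 'http://purl.org/dc/elements/1.1/date' };
collector.add({ subject, predicate, object: { value: '2013-10-30' } });
collector.add({ subject, predicate, object: { value: '2019-08-30' } });

console.log(JSON.stringify(collector.result(), null, 2));
```

With the real stream, the same idea would be to call `collector.add(data)` inside the `'data'` handler and read `collector.result()` in the `'end'` handler. Note that this buffers every element in memory, which gives up the main benefit of streaming for large files.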

templth commented 5 years ago

Thanks very much for your answer!

Agreed, it doesn't seem to be the right approach for my needs...