Bioschemas parser in JavaScript

Create a script that parses Bioschemas content.

Some of our content providers mark up their event content using the Bioschemas specifications. Bioschemas markup can be expressed as JSON-LD, RDFa, or Microdata. Focus on JSON-LD for this exercise. If you have time later, you could explore the other formats, but no worries if not.
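For orientation, JSON-LD is normally embedded in a page inside `<script type="application/ld+json">` elements. The sketch below shows one way to pull those blocks out of an HTML string. The function name `extractJsonLd` is made up for illustration, and the regular expression is a simplification; a proper HTML parser or a JSON-LD library would be more robust.

```javascript
// Hypothetical helper: collect every JSON-LD object embedded in an HTML string.
// Assumes the markup uses <script type="application/ld+json"> elements.
function extractJsonLd(html) {
  const pattern =
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const found = [];
  for (const match of html.matchAll(pattern)) {
    try {
      const data = JSON.parse(match[1]);
      // A block may hold a single object or an array of objects.
      found.push(...(Array.isArray(data) ? data : [data]));
    } catch (err) {
      // Skip blocks that are not valid JSON instead of aborting the whole page.
    }
  }
  return found;
}
```

Blocks that fail to parse are skipped silently here; logging them instead may be preferable when debugging a provider's markup.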

The Bioschemas Event specification is available in YAML format at https://github.com/BioSchemas/specifications/blob/master/Event/specification.html

Write a program that:

parses this file and takes everything under the properties: key in the YAML specification
goes through and collects each property name
downloads the schema.org spec for each of the expected types, e.g. if expected_type includes PostalAddress, you need to parse schema.org/PostalAddress.jsonld to get the properties of this subtype
downloads a target Bioschemas web page
parses the JSON-LD, perhaps using a parser such as https://www.npmjs.com/package/@rdfjs/parser-jsonld
extracts all the properties that match the ones collected from the YAML
extracts any sub-properties
pushes these properties to TeSS using the TeSS API Client
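The collecting and extracting steps above could be sketched roughly as follows. This is only an illustration under assumptions: the YAML is taken to be already parsed (with whichever YAML library you choose) into an object with a `properties` array whose entries have `name` and `expected_type` fields, and `subProps` stands in for the property lists you would obtain from the schema.org `.jsonld` files. The function names and that object layout are hypothetical and may not match the real specification file.

```javascript
// Assumed shape of the parsed spec (hypothetical):
//   { properties: [{ name: "location", expected_type: ["PostalAddress"] }, ...] }

// Build a Map of property name -> list of expected types.
function collectProperties(spec) {
  const wanted = new Map();
  for (const prop of spec.properties ?? []) {
    wanted.set(prop.name, [].concat(prop.expected_type ?? []));
  }
  return wanted;
}

// Keep only the listed keys of an object.
function pick(obj, keys) {
  return Object.fromEntries(
    Object.entries(obj).filter(([key]) => keys.includes(key))
  );
}

// Extract the wanted properties from one JSON-LD node. Nested objects whose
// @type is an expected type are reduced to that type's known sub-properties.
// subProps maps a type name to its property names, e.g.
//   { PostalAddress: ["addressLocality", "postalCode"] }
function extractProperties(node, wanted, subProps = {}) {
  const out = {};
  for (const [name, types] of wanted) {
    if (!(name in node)) continue;
    const value = node[name];
    const isObject =
      value !== null && typeof value === "object" && !Array.isArray(value);
    const type = isObject ? value["@type"] : undefined;
    const allowed = types.includes(type) ? subProps[type] : undefined;
    out[name] = allowed ? pick(value, ["@type", ...allowed]) : value;
  }
  return out;
}
```

Properties that are not listed in the spec are dropped, and nested values with an unexpected type are passed through unchanged so that nothing is lost silently.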

Whilst implementing this, think about how to make it as reusable as possible. For example, a developer should only have to change the URL of the target page to run it against another site.
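One possible way to get that reusability is to keep the target URL as the only required argument and pass everything else in as replaceable functions. The sketch below is a suggestion, not a prescribed design: `scrapeEvents` and the `fetchText`, `extract`, and `push` callbacks are hypothetical names, and `push` is where a call to the TeSS API Client would go.

```javascript
// Hypothetical pipeline entry point: only targetUrl changes between sites.
// The steps are injected so they can be swapped out or stubbed in tests:
//   fetchText(url) -> Promise resolving to the page HTML
//   extract(html)  -> array of event objects found in the page
//   push(event)    -> Promise, e.g. a wrapper around the TeSS API Client
async function scrapeEvents(targetUrl, { fetchText, extract, push }) {
  const html = await fetchText(targetUrl);
  const events = extract(html);
  for (const event of events) {
    await push(event); // pushed one at a time to keep ordering simple
  }
  return events.length; // number of events pushed
}
```

With this shape, running against a different provider means calling `scrapeEvents` with a different URL, and the download, extraction, and upload steps can each be tested on their own.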

Some target websites to test it against:

ElixirTeSS / TeSS_scrapers

Bioschemas parser in Javascript #71