iaincollins / structured-data-testing-tool

A library and command line tool to help inspect and test for Structured Data.
https://www.npmjs.com/package/structured-data-testing-tool
ISC License

Validating JSON blobs against schemas? #9

Closed pdehaan closed 4 years ago

pdehaan commented 4 years ago

Hello,

Is there a way to use the API to validate a JSON/HTML file with the following data, or do I need to write a custom schema and rules?

{
  "@context": "https://schema.org/",
  "@type": "Product",
  "@ids": "https://www.blah.org/#blahorg",
  "urls": "https://www.blah.org/en-US/developer/",
  "logod": "https://www.example.com/example-logo.jpg",
  "images": [
    "https://example.com/photos/1x1/photo.jpg",
    "https://example.com/photos/4x3/photo.jpg",
    "https://example.com/photos/16x9/photo.jpg"
  ],
  "named": "Blah Org",
  "alternateName": "Blah",
  "brand": {
    "@type": "Brand",
    "@id": "https://www.blah.org/#brand",
    "name": "Firefox"
  },
  "sameAs": ["https://en.wikipedia.org/wiki/blah"],
  "offers": {
    "@type": "Offer",
    "url": "https://www.blah.org/developer/",
    "priceCurrency": "USD",
    "price": "0",
    "availability": "https://schema.org/InStock"
  }
}

Currently I'm using the following function and looping over a glob of .json files, but I'm getting zero warnings or failed rules:

async function lintFile(file, options = {}) {
  const txt = fs.readFileSync(file, "utf-8").toString();
  const html = `<script type="application/ld+json">${txt}</script>`;
  return structuredDataTestHtml(html, options);
}

iaincollins commented 4 years ago

Hi Peter!

That seems like a great use case! It's not currently supported, but based on this I'm happy to add support for both a new structuredDataTestJson method and auto-detection if a JSON object (or array of objects) is passed. The command line tool should also support validating JSON input.

The approach you have should work in the interim (I did a quick check and it works for me), as long as you have an await before the function call:

async function lintFile(file, options = {}) {
  const txt = fs.readFileSync(file, "utf-8").toString();
  const html = `<script type="application/ld+json">${txt}</script>`;
  return await structuredDataTestHtml(html, options);
}

If this doesn't work for you, or if the docs need updating anywhere to reflect this, I'd appreciate any feedback (happy to help)…

However, ticket #5 is still in progress, which means that for now testing this way is likely of limited usefulness without a custom preset to check the properties and values are valid.
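To illustrate the kind of property check that is missing, here is a minimal standalone sketch. It does not use this library's API: the `findUnknownProperties` helper and the `KNOWN_PRODUCT_PROPERTIES` allowlist are hypothetical names, and the allowlist is a small hand-picked subset of Schema.org Product properties, not the full vocabulary.

```javascript
// Hypothetical, partial allowlist for illustration only.
// A real check would need the full Schema.org vocabulary for the type.
const KNOWN_PRODUCT_PROPERTIES = new Set([
  "@context", "@type", "@id", "url", "logo", "image", "name",
  "alternateName", "brand", "sameAs", "offers",
]);

// Return the top-level keys of a JSON-LD object that are not in the allowlist.
// Nested objects (e.g. "brand", "offers") are not inspected in this sketch.
function findUnknownProperties(jsonLd, known = KNOWN_PRODUCT_PROPERTIES) {
  return Object.keys(jsonLd).filter((key) => !known.has(key));
}

// A cut-down version of the sample from the original report.
const sample = {
  "@context": "https://schema.org/",
  "@type": "Product",
  "@ids": "https://www.blah.org/#blahorg",
  "urls": "https://www.blah.org/en-US/developer/",
  "logod": "https://www.example.com/example-logo.jpg",
  "named": "Blah Org",
};

console.log(findUnknownProperties(sample));
// → [ '@ids', 'urls', 'logod', 'named' ]
```

A check along these lines would flag the misspelled keys in the sample, which schema detection alone does not look at.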

If you have specific use cases, very happy to focus on those first if you want to elaborate!

pdehaan commented 4 years ago

This is what I get from the command line:

npx structured-data-testing-tool --file ./bedrock/base/templates/includes/structured-data/product/firefox-developer-product.json --schemas jsonld:Product
Tests

  Schema.org > Product - 0% (0 passed, 1 total)
    ✕  schema in jsonld [Product[*]]

Statistics

  Number of Metatags: 0
  Schemas in JSON-LD: 0
     Schemas in HTML: 0
      Schema in RDFa: 0
  Schema.org schemas: 0
       Other schemas: 0
    Test groups run : 1
     Total tests run: 1

Results

    Passed: 0 (0%)
  Warnings: 0 (0%)
    Failed: 1 (100%)

  ✕ 1 of 1 tests failed with 0 warnings.

And here's the output (no errors or warnings) when I try using the API:

sdttt.js

```js
const fs = require("fs");
const glob = require("glob").sync;
const { structuredDataTestHtml } = require("structured-data-testing-tool");
const presets = require("structured-data-testing-tool/presets");
const schemas = require("structured-data-testing-tool/lib/schemas");

main("./bedrock/base/templates/includes/structured-data/product/firefox-developer-product.json");

async function main(g) {
  if (!g) {
    console.error("Missing glob");
    process.exit(1);
  }
  for (const file of glob(g)) {
    try {
      // const SoftwareApplication = schemas.getSchema("SoftwareApplication");
      const res = await lintFile(file); // , {schemas: [SoftwareApplication]}); // {presets: [presets.SocialMedia]});
      const data = [].concat(res.failed, res.warnings);
      console.log(`${file}\n${JSON.stringify(data, null, 2)}\n\n`);
      if (res && (res.failed.length || res.warnings.length)) {
        console.error(file, res.failed, res.warnings);
        process.exitCode = 2;
      }
    } catch (err) {
      console.log("!!!", err);
      let res;
      if (err.res) {
        res = [].concat(err.res.failed, err.res.warnings)
          .map(err => err.error.message).sort();
      }
      console.error(file, res);
    }
  }
}

async function lintFile(file, options = {}) {
  const txt = fs.readFileSync(file, "utf-8").toString();
  const html = `<script type="application/ld+json">${txt}</script>`;
  return await structuredDataTestHtml(html, options);
}
```

node sdttt
./bedrock/base/templates/includes/structured-data/product/firefox-developer-product.json
[]

iaincollins commented 4 years ago

Thanks @pdehaan - sorry, I had missed the notification for this. Are you able to share the JSON file so I can investigate (either to fix a bug or to improve the docs, though it sounds like maybe both)?

pdehaan commented 4 years ago

Yeah, sorry. Looks like the [draft] PR is public; https://github.com/mozilla/bedrock/pull/7664

I think I tried experimenting w/ linting the JSON files directly, as well as wrapping the files in a <script type="application/ld+json"> tag and writing the file out as an .html and linting that.

iaincollins commented 4 years ago

Hi @pdehaan,

It's taken a while for me to get round to, but if it's of use to you or anyone else:

As of version 4.1, JSON input is supported from URLs, files, and strings / buffers / streams / etc., in both the API and the Command Line Interface.

It works as described earlier in this thread: when the input validates as serialized JSON, it is automatically wrapped in an appropriate script tag, so the validation is exactly the same as for HTML. No additional code or options are required to test JSON files.
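For anyone on an older version, the idea can be sketched roughly as below. This is an illustration of the technique, not the library's actual implementation: `wrapIfJson` is a hypothetical helper name, and treating only JSON objects and arrays (not bare numbers or strings) as JSON-LD is an assumption.

```javascript
// Wrap the input in a JSON-LD script tag only when it parses as a JSON
// object or array; otherwise return it unchanged (assumed to be HTML).
function wrapIfJson(input) {
  const text = String(input);
  let parsed;
  try {
    parsed = JSON.parse(text);
  } catch (err) {
    return text; // Not valid JSON, so leave it alone.
  }
  // Assumption: primitives such as 42 or "abc" are not treated as JSON-LD.
  if (parsed === null || typeof parsed !== "object") {
    return text;
  }
  return `<script type="application/ld+json">${text}</script>`;
}

console.log(wrapIfJson('{"@type":"Product"}'));
// → <script type="application/ld+json">{"@type":"Product"}</script>
console.log(wrapIfJson("<html></html>"));
// → <html></html>
```

The returned string can then be passed to structuredDataTestHtml in the same way as the lintFile function earlier in the thread.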

Although schema evaluation is improved in version 4, actual Schema.org property validation is part of the milestone for version 5, so this may still not be useful to you, though it may be useful for other folks.