pandastrike / jsck

JSON Schema Compiled checK
MIT License

add is-my-json-valid 2.0.2 to the benchmark #74

Closed mafintosh closed 9 years ago

mafintosh commented 9 years ago

(I noticed that #73 wired up the error handler incorrectly in the benchmark, so I decided to fix that in a new PR.)

Since is-my-json-valid 2.0.2 now passes the JSON Schema draft 4 test suite (except for remoteRef and Unicode surrogate pairs in maxLength/minLength), this PR adds it to the benchmark.

On my MacBook Air the benchmark yields the following results (is-my-json-valid 2.0.2 is 5x to 10x faster than the second-fastest validator):

## Benchmarks for Draft 4

Schema: 'Event - Valid document'.  A simple schema, exercising very few attributes
Sample size: 64
Validations per sample: 1024

  JSCK
  Warming up: ................................
  Iterations: ................................................................

  tv4
  Warming up: ................................
  Iterations: ................................................................

  jayschema
  Warming up: ................................
  Iterations: ................................................................

  is-my-json-valid
  Warming up: ................................
  Iterations: ................................................................

  z-schema
  Warming up: ................................
  Iterations: ................................................................

  JSCK: validations/millisecond
  median: 209.557    max: 228.418    min: 123.642

  tv4: validations/millisecond
  median: 105.971    max: 121.356    min: 75.914

  jayschema: validations/millisecond
  median: 1.656    max: 2.016    min: 1.153

  is-my-json-valid: validations/millisecond
  median: 2144.503    max: 2566.416    min: 1083.598

  z-schema: validations/millisecond
  median: 111.937    max: 118.423    min: 67.466

Relative speeds:
is-my-json-valid : 1.000
JSCK : 10.234
z-schema : 19.158
tv4 : 20.237
jayschema : 1294.885
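
The "Relative speeds" rows above look like the fastest median divided by each library's median, so the fastest library scores 1.000 and larger numbers mean slower. A minimal sketch under that assumption, using the medians printed for the 'Event - Valid document' run (the helper name `relativeSpeeds` is made up here, and the results only match the reported figures up to the rounding of the printed medians):

```javascript
// Median validations/millisecond copied from the 'Event - Valid document' run.
const medians = {
  "JSCK": 209.557,
  "tv4": 105.971,
  "jayschema": 1.656,
  "is-my-json-valid": 2144.503,
  "z-schema": 111.937,
};

// Assumption: relative speed = fastest median / this library's median.
// The fastest library gets 1.000; larger values mean proportionally slower.
function relativeSpeeds(medianByLibrary) {
  const fastest = Math.max(...Object.values(medianByLibrary));
  return Object.entries(medianByLibrary)
    .map(([name, median]) => ({ name, relative: fastest / median }))
    .sort((a, b) => a.relative - b.relative);
}

for (const { name, relative } of relativeSpeeds(medians)) {
  console.log(`${name} : ${relative.toFixed(3)}`);
}
```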

Schema: 'Configuration'.  A moderately complex schema with some nesting and value constraints
Sample size: 64
Validations per sample: 256

  JSCK
  Warming up: ................................
  Iterations: ................................................................

  tv4
  Warming up: ................................
  Iterations: ................................................................

  jayschema
  Warming up: ................................
  Iterations: ................................................................

  is-my-json-valid
  Warming up: ................................
  Iterations: ................................................................

  z-schema
  Warming up: ................................
  Iterations: ................................................................

  JSCK: validations/millisecond
  median: 116.417    max: 134.879    min: 81.789

  tv4: validations/millisecond
  median: 46.318    max: 50.137    min: 27.424

  jayschema: validations/millisecond
  median: 0.945    max: 1.096    min: 0.649

  is-my-json-valid: validations/millisecond
  median: 599.532    max: 1080.169    min: 212.272

  z-schema: validations/millisecond
  median: 44.195    max: 52.362    min: 23.178

Relative speeds:
is-my-json-valid : 1.000
JSCK : 5.150
tv4 : 12.944
z-schema : 13.566
jayschema : 634.276

Schema: 'Transaction'.
Sample size: 64
Validations per sample: 64

  JSCK
  Warming up: ................................
  Iterations: ................................................................

  tv4
  Warming up: ................................
  Iterations: ................................................................

  jayschema
  Warming up: ................................
  Iterations: ................................................................

  is-my-json-valid
  Warming up: ................................
  Iterations: ................................................................

  z-schema
  Warming up: ................................
  Iterations: ................................................................

  JSCK: validations/millisecond
  median: 11.221    max: 14.939    min: 5.365

  tv4: validations/millisecond
  median: 2.344    max: 3.25    min: 1.544

  jayschema: validations/millisecond
  median: 0.048    max: 0.049    min: 0.034

  is-my-json-valid: validations/millisecond
  median: 77.295    max: 103.393    min: 50.874

  z-schema: validations/millisecond
  median: 4.65    max: 5.018    min: 2.739

Relative speeds:
is-my-json-valid : 1.000
JSCK : 6.888
z-schema : 16.621
tv4 : 32.975
jayschema : 1600.178

automatthew commented 9 years ago

Fast work, you guys.

I updated json-schema-tests to use IMJV 2.0.2 in its examples, and it confirms you're passing almost all of the draft 4 tests. I'm seeing optional/bignum and optional/format(hostnames) failures still, but I can't assert that's not my problem. Let me know if you have any objections to the way I'm testing your lib.

mafintosh commented 9 years ago

Oh it seems I forgot to add the optional test cases. Give me a couple of minutes.

automatthew commented 9 years ago

Those won't pose performance problems anyway, @mafintosh.

mafintosh commented 9 years ago

@automatthew no, you're right. For completeness, though, 2.0.3 now supports the optional tests (except for zeroTerminatedFloats).

mafintosh commented 9 years ago

To answer your question, yeah your example looks fine :+1:

automatthew commented 9 years ago

Here's my plan, @mafintosh.

Giles's talk of "fastest JSON Schema validator" was based on our benchmarking tool at the time we ran it. Implicit in our statement was also "supporting mostly all of draft X". We ignored some competing packages on that basis. IMJV now passes my threshold for inclusion in our benchmarks, so I'm almost certainly going to accept the PR (after I pull and play with it).

Then, we need to blog about it. Key theme: the power of Open Source and Open Boasting to improve the world. Possibly also about the comparative values of code generation versus JSCK's approach.

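But then there's some goalpost-moving that I'll do … honorably.

As much as I love the existence of the official test suite, I'm not convinced it's complete enough to catch the actually interesting failures of validators. Thus I need to write some tests to check that all the validators we benchmark can detect subtle and/or funny invalidities in the subject documents. This is because I have found, more than once, situations where JSCK was passing all official tests, but still behaving badly by my understanding of the spec.

Error reporting. This is harder to benchmark fairly than successes. A validator that bails on the first error encountered is legitimate (and often desirable), but it obviously can (and probably will) have different performance than a validator that keeps trudging on through the whole document, collecting a list of grievances it can wave about at the end.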

So any further benchmarks we write will take into account these two factors, at least. I would also value any contributions or suggestions you have to make about the particular schemas and documents we're using, as well as the benchmarking setup in general.
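
For readers unfamiliar with the "code generation" idea mentioned in this comment, here is a toy sketch of the general technique, contrasted with a plain interpreter that walks the schema on every call. It is not a description of how IMJV or JSCK work internally; the schema subset (`required` plus primitive property types) and the names `interpret` and `compile` are hypothetical.

```javascript
// Toy schema subset: only `required` and per-property primitive `type`.
const schema = {
  required: ["id"],
  properties: { id: { type: "number" }, name: { type: "string" } },
};

// Interpreter: walks the schema object on every validation call.
function interpret(schema, doc) {
  for (const key of schema.required || []) {
    if (!(key in doc)) return false;
  }
  for (const [key, rule] of Object.entries(schema.properties || {})) {
    if (key in doc && typeof doc[key] !== rule.type) return false;
  }
  return true;
}

// Code generation: walks the schema once, emits JavaScript source text,
// and turns that text into a function with `new Function`.
function compile(schema) {
  const lines = [];
  for (const key of schema.required || []) {
    lines.push(`if (!(${JSON.stringify(key)} in doc)) return false;`);
  }
  for (const [key, rule] of Object.entries(schema.properties || {})) {
    const k = JSON.stringify(key);
    const t = JSON.stringify(rule.type);
    lines.push(`if (${k} in doc && typeof doc[${k}] !== ${t}) return false;`);
  }
  lines.push("return true;");
  return new Function("doc", lines.join("\n"));
}

const validate = compile(schema);
console.log(validate({ id: 1, name: "a" }), interpret(schema, { id: 1, name: "a" }));
console.log(validate({ name: "a" }), interpret(schema, { name: "a" }));
```

The generated function contains only straight-line checks for this one schema, which is the usual argument for why generated validators can be fast; whether that holds for a given library is exactly what a benchmark has to show.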

mafintosh commented 9 years ago

> Giles's talk of "fastest JSON Schema validator" was based on our benchmarking tool at the time we ran it. Implicit in our statement was also "supporting mostly all of draft X". We ignored some competing packages on that basis. IMJV now passes my threshold for inclusion in our benchmarks, so I'm almost certainly going to accept the PR (after I pull and play with it).
>
> Then, we need to blog about it. Key theme: the power of Open Source and Open Boasting to improve the world. Possibly also about the comparative values of code generation versus JSCK's approach.

:+1:

> As much as I love the existence of the official test suite, I'm not convinced it's complete enough to catch the actually interesting failures of validators. Thus I need to write some tests to check that all the validators we benchmark can detect subtle and/or funny invalidities in the subject documents. This is because I have found, more than once, situations where JSCK was passing all official tests, but still behaving badly by my understanding of the spec.

All for it! There should be a json-schema-tests module on npm that contains a good set of test cases

> Error reporting. This is harder to benchmark fairly than successes. A validator that bails on the first error encountered is legitimate (and often desirable), but it obviously can (and probably will) have different performance than a validator that keeps trudging on through the whole document, collecting a list of grievances it can wave about at the end.

Completely agree. The first iteration of IMJV bailed on first error but later versions now always do a full pass to give a complete error report (available on validate.errors). The performance impact of this IMO is negligible as it only matters in error scenarios.
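
To make the two error-reporting strategies concrete, here is a minimal sketch with a toy validator that can either bail on the first error or do a full pass. The `errors` property on the function mirrors the `validate.errors` mentioned above; the checks, the `makeValidator` helper, the `bailOnFirstError` option, and the `checksRun` counter are all hypothetical and not taken from any of the benchmarked libraries.

```javascript
// Toy checks: each returns an error message, or null when the document passes.
const checks = [
  (doc) => (typeof doc.id === "number" ? null : "id must be a number"),
  (doc) => (typeof doc.name === "string" ? null : "name must be a string"),
  (doc) => (Array.isArray(doc.tags) ? null : "tags must be an array"),
];

// Builds a validate function; `bailOnFirstError` selects the strategy.
function makeValidator(checks, { bailOnFirstError }) {
  function validate(doc) {
    validate.errors = [];
    validate.checksRun = 0; // counts work done, as a stand-in for cost
    for (const check of checks) {
      validate.checksRun += 1;
      const message = check(doc);
      if (message !== null) {
        validate.errors.push(message);
        if (bailOnFirstError) break;
      }
    }
    return validate.errors.length === 0;
  }
  return validate;
}

const failFast = makeValidator(checks, { bailOnFirstError: true });
const fullReport = makeValidator(checks, { bailOnFirstError: false });

const bad = { id: "x", name: 42, tags: [] };
failFast(bad);
fullReport(bad);
console.log(failFast.errors, failFast.checksRun);
console.log(fullReport.errors, fullReport.checksRun);
```

In this sketch the full-pass validator runs every check on an invalid document, which is the same amount of work it does on a valid one, while the fail-fast validator stops early only when something is wrong.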

automatthew commented 9 years ago

> The performance impact of this IMO is negligible as it only matters in error scenarios.

Except for DoS situations.

mafintosh commented 9 years ago

True. The cost is still (more or less) the same as validating a valid document.

automatthew commented 9 years ago

> There should be a json-schema-tests module on npm that contains a good set of test cases

I love that package. It's the best software I've ever seen. Whoever wrote it should get a medal. And a bottle of Scotch. Lagavulin, preferably.

That package relies on the same official test suite you've added to IMJV, which tests individual JSON Schema keywords in limited and isolated circumstances. Complex interactions are not tested. It may well all add up to success, but I'm still skeptical.

Incidentally, JSCK has recently added some meta-tests (to assert whether a schema is valid or invalid) using a similar data-driven approach that should eventually be usable by any project in any language. This flushed out several problems in my code.
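
As an illustration of the data-driven style discussed here, the sketch below runs test groups shaped like the official test suite's files (a `schema` plus a list of cases with `data` and an expected `valid` flag) against any validator factory. The `runGroups` helper and the `toyCompile` stand-in, which understands only `type: "object"` and `required`, are hypothetical; a real run would plug in an adapter for each library under test.

```javascript
// Test groups: { description, schema, tests: [{ description, data, valid }] }.
const groups = [
  {
    description: "type and required together (toy example)",
    schema: { type: "object", required: ["id"] },
    tests: [
      { description: "object with id", data: { id: 1 }, valid: true },
      { description: "object without id", data: {}, valid: false },
      { description: "array is not an object", data: [1], valid: false },
    ],
  },
];

// Stand-in validator factory supporting only `type: "object"` and `required`.
function toyCompile(schema) {
  return (data) => {
    const isObject =
      data !== null && typeof data === "object" && !Array.isArray(data);
    if (schema.type === "object" && !isObject) return false;
    if (isObject) {
      for (const key of schema.required || []) {
        if (!(key in data)) return false;
      }
    }
    return true;
  };
}

// Runs every case and returns descriptions of the ones that disagree
// with the expected `valid` flag.
function runGroups(groups, compile) {
  const failures = [];
  for (const group of groups) {
    const validate = compile(group.schema);
    for (const test of group.tests) {
      if (validate(test.data) !== test.valid) {
        failures.push(`${group.description}: ${test.description}`);
      }
    }
  }
  return failures;
}

console.log(runGroups(groups, toyCompile));
```

Because the cases are plain data, the same file can be fed to validators written in any language, and groups that combine several keywords are one way to probe the keyword interactions that isolated tests miss.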

gilesbowkett commented 9 years ago

wow, that was fast.

gilesbowkett commented 9 years ago

also fyi @mafintosh I believe @automatthew wrote the json-schema-tests package on npm. :-)

mafintosh commented 9 years ago

\o/

gilesbowkett commented 9 years ago

well-deserved high five ^_^