ocaml-community / yojson

Low-level JSON parsing and pretty-printing library for OCaml
https://ocaml-community.github.io/yojson/
BSD 3-Clause "New" or "Revised" License

RFC 7159 compliancy #34

Open 314eter opened 7 years ago

314eter commented 7 years ago

I tested Yojson on the test cases of https://github.com/nst/JSONTestSuite:

CRASH   n_array_extra_comma.json
CRASH   n_array_incomplete_invalid_value.json
CRASH   n_array_number_and_comma.json
CRASH   n_object_pi_in_key_and_trailing_comma.json
CRASH   n_object_trailing_comma.json
CRASH   n_object_with_single_string.json
CRASH   n_structure_end_array.json
CRASH   n_structure_lone-invalid-utf-8.json
CRASH   n_structure_open_array_apostrophe.json
CRASH   n_structure_open_array_comma.json
CRASH   n_structure_open_object_close_array.json
CRASH   n_structure_open_object_comma.json
CRASH   n_structure_open_object_open_array.json
CRASH   n_structure_single_point.json
CRASH   n_structure_single_star.json
SHOULD_HAVE_PASSED  y_string_utf16.json
SHOULD_HAVE_FAILED  n_number_infinity.json
SHOULD_HAVE_FAILED  n_number_minus_infinity.json
SHOULD_HAVE_FAILED  n_number_NaN.json
SHOULD_HAVE_FAILED  n_object_repeated_null_null.json
SHOULD_HAVE_FAILED  n_object_trailing_comment.json
SHOULD_HAVE_FAILED  n_object_trailing_comment_slash_open.json
SHOULD_HAVE_FAILED  n_object_unquoted_key.json
SHOULD_HAVE_FAILED  n_string_invalid_utf-8.json
SHOULD_HAVE_FAILED  n_string_iso_latin_1.json
SHOULD_HAVE_FAILED  n_string_lone_utf8_continuation_byte.json
SHOULD_HAVE_FAILED  n_string_overlong_sequence_2_bytes.json
SHOULD_HAVE_FAILED  n_string_overlong_sequence_6_bytes.json
SHOULD_HAVE_FAILED  n_string_overlong_sequence_6_bytes_null.json
SHOULD_HAVE_FAILED  n_string_unescaped_crtl_char.json
SHOULD_HAVE_FAILED  n_string_unescaped_newline.json
SHOULD_HAVE_FAILED  n_string_unescaped_tab.json
SHOULD_HAVE_FAILED  n_string_UTF8_surrogate_U+D800.json
SHOULD_HAVE_FAILED  n_structure_<null>.json
SHOULD_HAVE_FAILED  n_structure_object_with_comment.json
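
For readers unfamiliar with the suite: JSONTestSuite encodes the expected result in the file name (y_ must be accepted, n_ must be rejected, i_ is implementation-defined). The following is a minimal sketch of how the labels above can be derived from that convention. It is not the script used to produce the listing. The names classify, expectation_of_filename and the parse adapter are hypothetical: parse is assumed to return Ok () on acceptance, Error msg on a documented rejection, and to let any other exception escape, which is counted as a crash.

```ocaml
(* Expectation encoded in the JSONTestSuite file name. *)
type expectation = Must_accept | Must_reject | Either

(* Result of running one test file through a parser. *)
type outcome = As_expected | Crash | Should_have_passed | Should_have_failed

let expectation_of_filename name =
  if String.length name < 2 then Either
  else
    match String.sub name 0 2 with
    | "y_" -> Must_accept
    | "n_" -> Must_reject
    | _ -> Either (* i_ files and anything unrecognized *)

(* [parse] is a hypothetical adapter around the parser under test:
   [Ok ()] = input accepted, [Error _] = input rejected cleanly.
   Any exception that escapes [parse] is reported as a crash. *)
let classify ~parse path contents =
  let expected = expectation_of_filename (Filename.basename path) in
  match parse contents with
  | exception _ -> Crash
  | Ok () -> if expected = Must_reject then Should_have_failed else As_expected
  | Error _ -> if expected = Must_accept then Should_have_passed else As_expected

(* Labels in the same style as the listing. *)
let label = function
  | As_expected -> "OK"
  | Crash -> "CRASH"
  | Should_have_passed -> "SHOULD_HAVE_PASSED"
  | Should_have_failed -> "SHOULD_HAVE_FAILED"
```

For Yojson, the adapter would presumably map a successful Yojson.Safe.from_string call to Ok () and a Json_error to Error, leaving every other exception to be counted as a crash.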

The problems are:

  1. Some inputs (e.g. Yojson.Safe.from_string "x") raise a Failure "lexing: empty token" exception instead of a Json_error.
  2. The numeric values NaN and Infinity, duplicate keys, comments, unquoted keys and unescaped control characters such as tabs or newlines are not permitted according to RFC 7159. I don't think it's a problem to support some extensions to the specification, as long as to_string always returns valid JSON. In that case, only NaN, Infinity and duplicate keys are problematic.
  3. All other problems are related to UTF-8 and UTF-16. How UTF-8 and UTF-16 are handled is a design choice, so these can be ignored.
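
Until the first problem is fixed, a caller can work around it by funnelling every exception into a single error channel. The sketch below shows one way to do that. The helper name parse_result is hypothetical, and the parser is passed in as an argument so that the example only needs the standard library. With Yojson, the argument would be Yojson.Safe.from_string.

```ocaml
(* Run [parse] on [s] and convert any exception into [Error], so the
   caller does not need to know which exception the parser raises.
   [parse] stands in for e.g. Yojson.Safe.from_string (an assumption;
   any [string -> 'a] function works). *)
let parse_result (parse : string -> 'a) (s : string) : ('a, string) result =
  match parse s with
  | v -> Ok v
  (* The undocumented case from item 1: a bare Failure from the lexer. *)
  | exception Failure msg -> Error ("unexpected Failure: " ^ msg)
  (* Everything else, including the parser's own error exception. *)
  | exception exn -> Error (Printexc.to_string exn)
```

Catching every exception is deliberately blunt: it also swallows exceptions such as Out_of_memory, so a production version would likely match only the exceptions the parser is known to raise.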

mjambon commented 7 years ago

Cool.

mjambon commented 7 years ago

This is a great test suite, and yojson doesn't have any. Would you like to leave a script that would fetch and run this test suite?

NathanReb commented 5 years ago

I'll add the successful ones to the current test suite and the rest of the test cases to a separate one.

Then we can start fixing the ones worth fixing and migrate them from one test suite to the other to ensure we don't break compliance in the future.

314eter commented 5 years ago

The crashes were all fixed by #35, so only the SHOULD_HAVE_FAILED cases are left.

NathanReb commented 5 years ago

Yeah I just ran the test suite I added right now with Yojson.Safe and the remaining failures are:

I'll polish the test suite a bit to get a better output on the failures and get it merged so we have something to work on.

NathanReb commented 5 years ago

It seems to be a slightly different set of failures, but the test suite now covers the more recent RFC 8259, which explains some of the differences.