raml-org / raml-tck

Test Compatibility Kit for RAML 1.0
http://raml-org.github.io/raml-tck/

Evaluate error status disregarding error count #62

Closed svacas closed 8 years ago

svacas commented 8 years ago

As TCK tests should be minimal and contain at most a single error, there's no need to compare the error count.

ddenisenko commented 8 years ago

Sometimes you just need to have more than a single error in a test. Example:

MyType:
  properties:
    p1: string
  additionalProperties: false
  example:
    p_1: asd

Here we have 2 errors: one for the missing p1 property and another for the unknown p_1 property.

While minimal tests are good in theory, we also want to include real-life user examples to better catch regressions, and in any case there are already quite a few tests in the TCK with more than a single error. Why limit ourselves to a single error per test when we can easily support several?

Also, the final goal is to compare the parsers' JSON output, not just the number of errors (which is what the Java parser tests now), or even a single binary state of "having errors or not", as this pull request implies.

And that JSON, besides the actual AST, lists all the errors, including error messages and positions.
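
For reference, a rough sketch of the shape of one such error record, written as a TypeScript interface and inferred from the expected output quoted further down in this thread; the interface and field comments are illustrative, not part of any TCK or parser API.

// Sketch of a single parser error record, inferred from the expected JSON
// output quoted later in this thread; names are illustrative only.
interface ErrorPosition {
  line: number;
  column: number;
  position: number;  // absolute character offset in the source file
}

interface ParserError {
  path: string;                          // e.g. "apiInvalid.raml"
  code: number;                          // parser-specific error code
  line: number;
  column: number;
  range: [ErrorPosition, ErrorPosition]; // start and end of the offending span
  position: number;
  message: string;                       // human-readable error message
}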

eleonoraortega commented 8 years ago

@ddenisenko This is correct in the case you are showing. These are two different errors.

But that is not a TCK problem; it is a different issue.

In this example:

#%RAML 1.0
title: API

types:
  SimpleType1:
    properties:
      property1: string
      property2: string

  SimpleType2:
    properties:
      property3: string
      property4: string

  SimpleUnion:
    type: SimpleType1 | SimpleType2

  TypeWithUnionProps:
    type: object
    properties:
      union1: SimpleUnion
      union2: SimpleUnion
    example:
      union1:
        property3: blahblah
        property2: blahblah
      union2:
        property3: blahblah
        property4: blahblah

This is the output:

        expected:
                {"path":"apiInvalid.raml","code":11,"line":23,"column":6,"range":[{"line":23,"column":6,"position":368},{"line":23,"column":12,"position":374}],"position":368,"message":"Union type option does not pass validation (SimpleType1: Required property: property1 is missed)"}
                {"path":"apiInvalid.raml","code":11,"line":23,"column":6,"range":[{"line":23,"column":6,"position":368},{"line":23,"column":12,"position":374}],"position":368,"message":"Union type option does not pass validation (SimpleType2: Required property: property4 is missed)"}
                {"path":"apiInvalid.raml","code":11,"line":23,"column":6,"range":[{"line":23,"column":6,"position":368},{"line":23,"column":12,"position":374}],"position":368,"message":"Union type options do not pass validation"}
        actually:
                Invalid element {
property3: blahblah,
property2: blahblah
}.

Right now, verifying whether there is an error and counting the errors amount to the same thing; it does not affect the quality today.

ddenisenko commented 8 years ago

Well, I personally prefer the amount of detail provided by the first output.

If you are sure that the current single-error output provides the user with sufficient information, a mechanism can be created that allows a particular parser test suite to set up an override for the number of errors in a particular case. Each override should be reviewed by human eyes to make sure it exists because of a different style of error output and not because of an actual bug or some other reason.
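
A minimal sketch of such an override mechanism, assuming a per-suite map from test file to the error count that suite accepts; the TypeScript names and the path key are hypothetical, not an existing TCK facility.

// Hypothetical per-parser overrides: test file path -> accepted error count.
// Every entry should only be added after a human has reviewed the difference
// and confirmed it comes from error-reporting style, not from a bug.
const errorCountOverrides: Record<string, number> = {
  "apiInvalid.raml": 1, // JS parser reports 3 errors here, this suite reports 1
};

// Fall back to the TCK's error count unless an explicit override exists.
function expectedErrorCount(testPath: string, tckErrorCount: number): number {
  return errorCountOverrides[testPath] ?? tckErrorCount;
}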

While this sample demonstrates a difference in error output style, I am pretty sure we can find some tests in the TCK that can be cut down to contain a single error, as @svacas suggests, and that will make those tests better. But, as a general matter, it should not prevent us from supporting cases with several errors. Refusing to support several errors at once is just too restrictive.

eleonoraortega commented 8 years ago

Hi @evangelina @sichvoge, we need to move forward on this. Can you resolve this today, please? Thanks!

svacas commented 8 years ago

It's not an error that different parsers show a different number of errors, as long as the errors are detected. The number of errors will depend on the implementation and on how far parsing continues once an error is found. Also, as we use external libraries for XSD and JSON Schema validation, we rely on the errors reported by those libraries in those cases.

ddenisenko commented 8 years ago

I agree that it is sometimes not an error to have different error messages, or even a different number of errors. However, if we want to maintain parser compatibility, we want to keep the number of differences low.

So it would be really good if the testing framework forced a human to take a look at any difference and verify that there is a good reason for it.

Also, even the number of errors is not a very good criterion for verifying that some functionality passes a test. After the JS parser was migrated from testing the number of errors to testing the exact error messages, we found a lot of bugs hiding behind this difference in test approach. And a boolean check (has errors or not) will hide even more bugs than an error-count test.
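
To make the three levels of strictness concrete, a small sketch in TypeScript of the boolean check, the error-count check, and the message comparison discussed here; the helper names and the minimal error shape are assumptions for illustration.

// Minimal error shape for this sketch; only the message field matters here.
type Err = { message: string };

// Weakest check: only asks whether both sides report at least one error.
function hasErrorsCheck(expected: Err[], actual: Err[]): boolean {
  return (expected.length > 0) === (actual.length > 0);
}

// Stricter: the number of reported errors must match.
function errorCountCheck(expected: Err[], actual: Err[]): boolean {
  return expected.length === actual.length;
}

// Strictest of the three: every expected message must appear in the output.
function errorMessageCheck(expected: Err[], actual: Err[]): boolean {
  const actualMessages = new Set(actual.map(e => e.message));
  return expected.every(e => actualMessages.has(e.message));
}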

So, guys, it's up to you; it is your parser after all. I am just telling you what we discovered. The best would be to compare the whole JSON, but that is certainly something for later. Right now, if I got it right, you are close to passing the error-count check, and only a few tests cause trouble due to a difference in the number of errors. If you introduce an overriding mechanism to mark such tests, you could mark just those few and still maintain a better level of compatibility with the JS parser and better verification.

eleonoraortega commented 8 years ago

@ddenisenko It is OK from our side, then. Let me know if you need anything in order to merge this. Thank you so much. Eleonora