json-schema-org / json-schema-spec

The JSON Schema specification
http://json-schema.org/

Schema validation crashes with stack overflow #1528

Closed vmelamed closed 1 week ago

vmelamed commented 1 week ago

Hello,

Over the last 2-3 months I've been building a schema that has become quite large: almost 2000 lines, even with some minification. The schema is, naturally, quite recursive, since it defines the JSON structure for serializing LINQ expressions (ASTs). Yesterday I added the last 4 types (literally at the bottom of the file) and all my tests crashed with a stack overflow (in case it helps, I am pasting the stack below). I then tried other schema validators online and also replaced JsonSchema with Newtonsoft's JSchema; they all work. If you have advice on how to fix it (unless it is a bug), I'd appreciate it very much, as I have already committed to System.Text.Json (and hence Json.Schema), and adding Newtonsoft to the mix ... smells bad. You can find plenty of valid JSON documents in my test folders. If you need more information about the project, etc., please don't hesitate to contact me here or directly.

Thank you very much for the good work and previous support with other issues. Val

========== Starting test run ==========
[xUnit.net 00:00:00.00] xUnit.net VSTest Adapter v2.8.1+ce9211e970 (64-bit .NET 8.0.6)
[xUnit.net 00:00:00.08]   Starting:    JsonTests
The active test run was aborted. Reason: Test host process crashed : Stack overflow.
   at System.Buffers.IndexOfAnyAsciiSearcher.IndexOfAnyLookupCore(System.Runtime.Intrinsics.Vector256`1<Byte>, System.Runtime.Intrinsics.Vector256`1<Byte>)
   at System.Buffers.IndexOfAnyAsciiSearcher.IndexOfAnyVectorized[[System.Buffers.IndexOfAnyAsciiSearcher+DontNegate, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](Byte ByRef, Int32, System.Runtime.Intrinsics.Vector256`1<Byte> ByRef)
   at System.MemoryExtensions.IndexOfAny[[System.Byte, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.ReadOnlySpan`1<Byte>, System.Buffers.SearchValues`1<Byte>)
   at System.Text.Json.Utf8JsonReader.ConsumeString()
   at System.Text.Json.Utf8JsonReader.ConsumeValue(Byte)
   at System.Text.Json.Utf8JsonReader.ConsumeNextToken(Byte)
   at System.Text.Json.Utf8JsonReader.ConsumeNextTokenOrRollback(Byte)
   at System.Text.Json.Utf8JsonReader.ReadSingleSegment()
   at System.Text.Json.Utf8JsonReader.Read()
   at System.Text.Json.Utf8JsonReader.TrySkip()
   at System.Text.Json.JsonDocument.TryParseValue(System.Text.Json.Utf8JsonReader ByRef, System.Text.Json.JsonDocument ByRef, Boolean, Boolean)
   at System.Text.Json.Serialization.Converters.JsonArrayConverter.Read(System.Text.Json.Utf8JsonReader ByRef, System.Type, System.Text.Json.JsonSerializerOptions)
   at Json.More.JsonSerializerOptionsExtensions.Read[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.Text.Json.JsonSerializerOptions, System.Text.Json.Utf8JsonReader ByRef, System.Text.Json.Serialization.Metadata.JsonTypeInfo`1<System.__Canon>)
   at Json.Schema.EnumKeywordJsonConverter.Read(System.Text.Json.Utf8JsonReader ByRef, System.Type, System.Text.Json.JsonSerializerOptions)
   at Json.More.WeaklyTypedJsonConverter`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].Json.More.IWeaklyTypedJsonConverter.Read(System.Text.Json.Utf8JsonReader ByRef, System.Type, System.Text.Json.JsonSerializerOptions)
   at Json.Schema.JsonSerializerOptionsExtensions.Read(System.Text.Json.JsonSerializerOptions, System.Text.Json.Utf8JsonReader ByRef, System.Type, System.Text.Json.Serialization.Metadata.JsonTypeInfo)
   at Json.Schema.SchemaJsonConverter.Read(System.Text.Json.Utf8JsonReader ByRef, System.Type, System.Text.Json.JsonSerializerOptions)
   at System.Text.Json.Serialization.JsonConverter`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].TryRead(System.Text.Json.Utf8JsonReader ByRef, System.Type, System.Text.Json.JsonSerializerOptions, System.Text.Json.ReadStack ByRef, System.__Canon ByRef, Boolean ByRef)
   at System.Text.Json.Serialization.JsonConverter`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].ReadCore(System.Text.Json.Utf8JsonReader ByRef, System.Text.Json.JsonSerializerOptions, System.Text.Json.ReadStack ByRef)
   at System.Text.Json.JsonSerializer.ReadFromSpan[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.ReadOnlySpan`1<Byte>, System.Text.Json.Serialization.Metadata.JsonTypeInfo`1<System.__Canon>, System.Nullable`1<Int32>)
   at System.Text.Json.JsonSerializer.ReadFromSpan[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.ReadOnlySpan`1<Char>, System.Text.Json.Serialization.Metadata.JsonTypeInfo`1<System.__Canon>)
   at System.Text.Json.JsonSerializer.Deserialize[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.String, System.Text.Json.Serialization.Metadata.JsonTypeInfo`1<System.__Canon>)
   at Json.Schema.JsonSchema.FromText(System.String)
   at Json.Schema.JsonSchema.<Json.Schema.IBaseDocument.FindSubschema>g__ExtractSchemaFromData|54_0(Json.Pointer.JsonPointer, System.Text.Json.Nodes.JsonNode, Json.Schema.JsonSchema, <>c__DisplayClass54_0 ByRef)
   at Json.Schema.JsonSchema.<Json.Schema.IBaseDocument.FindSubschema>g__CheckResolvable|54_1(System.Object, Int32 ByRef, System.String, Json.Schema.JsonSchema ByRef, <>c__DisplayClass54_0 ByRef)
   at Json.Schema.JsonSchema.Json.Schema.IBaseDocument.FindSubschema(Json.Pointer.JsonPointer, Json.Schema.EvaluationOptions)
   at Json.Schema.RefKeyword.GetConstraint(Json.Schema.SchemaConstraint, System.ReadOnlySpan`1<Json.Schema.KeywordConstraint>, Json.Schema.EvaluationContext)
   at Json.Schema.JsonSchema.PopulateConstraint(Json.Schema.SchemaConstraint, Json.Schema.EvaluationContext)
   at Json.Schema.JsonSchema.GetConstraint(Json.Pointer.JsonPointer, Json.Pointer.JsonPointer, Json.Pointer.JsonPointer, Json.Schema.EvaluationContext)
   at Json.Schema.PropertiesKeyword+<>c__DisplayClass9_0.<GetConstraint>b__0(System.Collections.Generic.KeyValuePair`2<System.String,Json.Schema.JsonSchema>)
   at System.Linq.Enumerable+SelectEnumerableIterator`2[[System.Collections.Generic.KeyValuePair`2[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].ToArray()
   at Json.Schema.PropertiesKeyword.GetConstraint(Json.Schema.SchemaConstraint, System.ReadOnlySpan`1<Json.Schema.KeywordConstraint>, Json.Schema.EvaluationContext)
   at Json.Schema.JsonSchema.PopulateConstraint(Json.Schema.SchemaConstraint, Json.Schema.EvaluationContext)
   at Json.Schema.JsonSchema.GetConstraint(Json.Pointer.JsonPointer, Json.Pointer.JsonPointer, Json.Pointer.JsonPointer, Json.Schema.EvaluationContext)
   at Json.Schema.RefKeyword.GetConstraint(Json.Schema.SchemaConstraint, System.ReadOnlySpan`1<Json.Schema.KeywordConstraint>, Json.Schema.EvaluationContext)
   at Json.Schema.JsonSchema.PopulateConstraint(Json.Schema.SchemaConstraint, Json.Schema.EvaluationContext)
   at Json.Schema.JsonSchema.GetConstraint(Json.Pointer.JsonPointer, Json.Pointer.JsonPointer, Json.Pointer.JsonPointer, Json.Schema.EvaluationContext)
   at Json.Schema.PropertiesKeyword+<>c__DisplayClass9_0.<GetConstraint>b__0(System.Collections.Generic.KeyValuePair`2<System.String,Json.Schema.JsonSchema>)
   at System.Linq.Enumerable+SelectEnumerableIterator`2[[System.Collections.Generic.KeyValuePair`2[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].ToArray()
   at Json.Schema.PropertiesKeyword.GetConstraint(Json.Schema.SchemaConstraint, System.ReadOnlySpan`1<Json.Schema.KeywordConstraint>, Json.Schema.EvaluationContext)
   at Json.Schema.JsonSchema.PopulateConstraint(Json.Schema.SchemaConstraint, Json.Schema.EvaluationContext)
   at Json.Schema.JsonSchema.GetConstraint(Json.Pointer.JsonPointer, Json.Pointer.JsonPointer, Json.Pointer.JsonPointer, Json.Schema.EvaluationContext)
   at Json.Schema.OneOfKeyword+<>c__DisplayClass6_0.<GetConstraint>b__0(Json.Schema.JsonSchema, Int32)
   at System.Linq.Enumerable+<SelectIterator>d__229`2[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
   at System.Collections.Generic.LargeArrayBuilder`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].AddRange(System.Collections.Generic.IEnumerable`1<System.__Canon>)
   at System.Collections.Generic.
========== Test run aborted: 0 Tests (0 Passed, 0 Failed, 0 Skipped) run in < 1 ms ==========

gregsdennis commented 1 week ago

Hey there. I'm the maintainer of JsonSchema.Net. I'm happy to have a look if you'd like to refile this issue over in the json-everything repository.

This repository is for the specification itself and not the right place for implementation-specific questions.

gregsdennis commented 1 week ago

I have discovered the problem, and it's probably specific to my implementation.

The short answer is that you have your $defs kinda messed up. It needs to be a flat dictionary, not a nested folder lookup.


What's happening is that it's reading /$defs/components as a schema, which means all of the keys under it are treated as unknown keywords. As a result, it doesn't try to deserialize them... until they're referenced. And it deserializes them each time they're referenced (probably something I can try to fix up).

So every time you reference something like #/$defs/components/tokens/identifier, this happens:

SchemaRegistry only tracks named subschemas, ones that contain $id, $anchor, or $dynamicAnchor (or $recursiveAnchor for 2019-09). It doesn't track pointer-identified subschemas. So when you point into an unknown keyword, it has to go through a lot of deserialization. When you then do that recursively... you saw what happened.
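
To check a schema for this pattern, here is a minimal Python sketch (not part of JsonSchema.Net) that lists local `$ref` values reaching more than one level below `$defs`. The helper names and the sample schema are made up for illustration; only the pointer `#/$defs/components/tokens/identifier` comes from this thread. It is a heuristic: it also flags legitimate deep pointers such as `#/$defs/foo/properties/bar`, so treat the output as a list of candidates to review.

```python
def iter_refs(node):
    """Yield every "$ref" string found anywhere in a schema-like structure."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                yield value
            else:
                yield from iter_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_refs(item)


def refs_into_nested_defs(schema):
    """Return local refs that go deeper than #/$defs/<name> (heuristic)."""
    prefix = "#/$defs/"
    nested = set()
    for ref in iter_refs(schema):
        if ref.startswith(prefix) and "/" in ref[len(prefix):]:
            nested.add(ref)
    return sorted(nested)


# Hypothetical schema: "components" and "tokens" are grouping folders,
# not schemas, so the $ref below points through unknown keywords.
nested_schema = {
    "$defs": {
        "components": {
            "tokens": {
                "identifier": {"type": "string"},
            },
        },
    },
    "properties": {
        "name": {"$ref": "#/$defs/components/tokens/identifier"},
    },
}

print(refs_into_nested_defs(nested_schema))
```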

The solution is to flatten your $defs and not point $ref at locations where subschemas aren't expected. Fix that up and it'll easily validate the instances. (I tried it.)
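
For a schema of this size, the flattening could be scripted. Below is a rough Python sketch under several assumptions that are mine, not the thread's: a dict under `$defs` counts as a schema if it is empty or contains at least one keyword from a small hand-picked set (otherwise it is treated as a grouping folder), flattened names are the folder path joined with `.`, and no definition name contains `/` or `~` (so no JSON Pointer escaping is needed). Name collisions are not handled, and a folder that happens to have a child named like a keyword would be misdetected.

```python
import copy

# Assumption: any dict containing one of these keys is a schema, not a folder.
SCHEMA_KEYWORDS = {"type", "$ref", "properties", "items", "enum", "const",
                   "oneOf", "anyOf", "allOf", "not", "required"}


def flatten_defs(schema, sep="."):
    """Return a copy of `schema` whose $defs is a single flat dictionary."""
    result = copy.deepcopy(schema)
    if not isinstance(result.get("$defs"), dict):
        return result
    flat, renamed = {}, {}

    def visit(node, path):
        is_schema = (not isinstance(node, dict) or not node
                     or SCHEMA_KEYWORDS.intersection(node))
        if path and is_schema:
            name = sep.join(path)
            flat[name] = node
            renamed["#/$defs/" + "/".join(path)] = "#/$defs/" + name
        else:
            for key, child in node.items():
                visit(child, path + [key])

    def rewrite(node):
        # Replace every $ref that pointed at an old nested location.
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "$ref" and isinstance(value, str):
                    node[key] = renamed.get(value, value)
                else:
                    rewrite(value)
        elif isinstance(node, list):
            for item in node:
                rewrite(item)

    visit(result["$defs"], [])
    result["$defs"] = flat
    rewrite(result)
    return result


# Hypothetical recursive schema with a nested "components" folder.
recursive_schema = {
    "$ref": "#/$defs/components/expression",
    "$defs": {
        "components": {
            "tokens": {"identifier": {"type": "string"}},
            "expression": {
                "oneOf": [
                    {"$ref": "#/$defs/components/tokens/identifier"},
                    {"type": "array",
                     "items": {"$ref": "#/$defs/components/expression"}},
                ],
            },
        },
    },
}

flattened = flatten_defs(recursive_schema)
print(sorted(flattened["$defs"]))
```

After this, every `$ref` targets a direct child of `$defs`, which is a location where a subschema is expected. The input schema is left unmodified because the function works on a deep copy.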

gregsdennis commented 1 week ago

(Just out of curiosity, why are you serializing LINQ expressions? And then, why do you need to validate them?)

vmelamed commented 1 week ago

Moved the issue to json-everything (sorry!). I'll add answers to your other questions there.

gregsdennis commented 1 week ago

No worries. I was curious and impatient 😆