vpenades / SharpGLTF

glTF reader and writer for .NET Standard
MIT License

CodeGen - Self referencing array in json schema results in stackoverflow #236

Open windowslucker1121 opened 1 month ago

windowslucker1121 commented 1 month ago

Hey folks, can someone give me a hand with this scenario?

I have a schema that carries some data and can also contain multiple instances of itself in an array, as defined below:

{
    "$schema": "http://json-schema.org/draft-04/schema",
    "title": "ProcessNode Schema",
    "type": "object",
    "description": "glTF extension for ProcessNode.",
    "properties": {
        "Process": {
            "type": "array",
            "items": {
                "$ref": "#"
            },
            "description": "A list of Process elements.",
            "minItems": 0
        },
        "EventHandler": {
            "description": "A list of EventHandler elements",
            "type": "array",
            "items": {
                "method": {
                    "type": "string"
                }
            },
            "minItems": 0
        },
        "Function": {
            "type": "string"
        },
        "id": {
            "type": "string",
            "description": "Unique identifier for the ProcessNode."
        }
    },
    "required": [
        "id"
    ]
}

But running the CodeGen tool results in a stack overflow:

Stack overflow.
   at System.Linq.Enumerable.Select[[System.__Canon, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.Collections.Generic.IEnumerable`1<System.__Canon>, System.Func`2<System.__Canon,System.__Canon>)

.
.
.
.
   at SharpGLTF.SchemaReflection.SchemaTypesReader._UseType(Context, NJsonSchema.JsonSchema, Boolean)
   at SharpGLTF.SchemaReflection.SchemaTypesReader._UseType(Context, NJsonSchema.JsonSchema, Boolean)
   at SharpGLTF.SchemaReflection.SchemaTypesReader._UseType(Context, NJsonSchema.JsonSchema, Boolean)
   at SharpGLTF.SchemaReflection.SchemaTypesReader._UseType(Context, NJsonSchema.JsonSchema, Boolean)
   at SharpGLTF.SchemaReflection.SchemaTypesReader._UseType(Context, NJsonSchema.JsonSchema, Boolean)
   at SharpGLTF.SchemaReflection.SchemaTypesReader.Generate(NJsonSchema.CodeGeneration.CSharp.CSharpTypeResolver)
   at SharpGLTF.SchemaProcessing.LoadSchemaContext(System.String)

How would I prevent that, or tell the CodeGen tool to process this node only once at the top level and then reference it?

I can't find anything like my schema among the already defined schemas, and I can't find an appropriate function in the CodeGen tool, so I'm out of my depth here.

Thanks in advance!

vpenades commented 1 month ago

This is probably a bug, since self references are something I was not expecting when I wrote the generator.

Most probably the solution is to put a barrier somewhere that prevents reentrancy when it detects that a type is already being processed.

Basically, _UseType needs to cache its result in a dictionary, and if _UseType is called again with the same parameters, it should return the cached value instead of doing a full reprocessing.
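
A minimal sketch of such a cache, using hypothetical toy types (`ToySchema`, `ToyClass`, `CachedResolver`) rather than the real SharpGLTF or NJsonSchema classes. It assumes the cache entry is stored before the properties are visited, so that a self reference finds the still incomplete declaration instead of recursing again:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration; NOT SharpGLTF or NJsonSchema types.
public sealed class ToySchema
{
    public string Title = "";
    public List<ToySchema> Properties { get; } = new List<ToySchema>();
}

public sealed class ToyClass
{
    public string Name = "";
    public List<ToyClass> FieldTypes { get; } = new List<ToyClass>();
}

public static class CachedResolver
{
    // Returns the class for a schema, reusing the cached instance on reentry.
    // The dictionary compares keys by reference, because ToySchema does not
    // override Equals or GetHashCode.
    public static ToyClass UseType(ToySchema schema, Dictionary<ToySchema, ToyClass> cache)
    {
        if (schema == null) throw new ArgumentNullException(nameof(schema));
        if (cache == null) throw new ArgumentNullException(nameof(cache));

        if (cache.TryGetValue(schema, out var known)) return known;

        var decl = new ToyClass { Name = schema.Title };

        // Cache the declaration BEFORE descending, so that a self reference
        // finds it instead of recursing forever.
        cache[schema] = decl;

        foreach (var p in schema.Properties)
        {
            decl.FieldTypes.Add(UseType(p, cache));
        }

        return decl;
    }
}
```

With a `ToySchema` whose property list contains itself, `UseType` returns a `ToyClass` whose first field type is that same instance, and the recursion stops after one level.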

I am extremely busy lately, so I don't know when I'll have time to look into it. If you're in a hurry, I'd suggest trying to fix it yourself and maybe creating a pull request with the solution.

windowslucker1121 commented 1 month ago

I investigated this error and it is exactly what you described. After three hours of trying, I don't think I'm capable of fixing this issue.

What I did was create a dictionary-like cache of already processed schemas. Then I tried reusing the cached SchemaTypes when a schema had already been processed, but I found that I can't reuse them: we are inside a recursive loop, so that schema isn't fully processed yet.

Then I tried replacing the schema currently being processed, which was already in the cache, with a placeholder schema. But now I'm stuck with it being a placeholder.
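
One possible way past the leftover placeholders is a fix-up pass that runs after generation and swaps each placeholder for the finished type it names. The sketch below is only an illustration under assumptions: `ToyType`, `ToyPlaceholder`, `ToyArrayType`, `ToyClassType` and `PlaceholderFixup` are hypothetical toy types, not the real SharpGLTF `SchemaType` hierarchy, and `Target` is an assumed lookup key:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration; NOT the real SharpGLTF types.
public abstract class ToyType { }

public sealed class ToyPlaceholder : ToyType
{
    public string Target = "";   // assumed key of the type this stands in for
}

public sealed class ToyArrayType : ToyType
{
    public ToyType ElementType;
    public ToyArrayType(ToyType elementType) { ElementType = elementType; }
}

public sealed class ToyClassType : ToyType
{
    public string Name = "";
    public List<ToyType> FieldTypes { get; } = new List<ToyType>();
}

public static class PlaceholderFixup
{
    // Walks every finished class and patches placeholders found in its fields.
    public static void ResolveAll(IReadOnlyDictionary<string, ToyType> finished)
    {
        foreach (var type in finished.Values)
        {
            if (type is ToyClassType cls)
            {
                for (int i = 0; i < cls.FieldTypes.Count; i++)
                {
                    cls.FieldTypes[i] = Patch(cls.FieldTypes[i], finished);
                }
            }
        }
    }

    // Returns the real type for a placeholder, and patches array elements in place.
    private static ToyType Patch(ToyType type, IReadOnlyDictionary<string, ToyType> finished)
    {
        switch (type)
        {
            case ToyPlaceholder ph:
                if (finished.TryGetValue(ph.Target, out var real) && !(real is ToyPlaceholder)) return real;
                throw new InvalidOperationException("Unresolved placeholder: " + ph.Target);
            case ToyArrayType arr:
                arr.ElementType = Patch(arr.ElementType, finished);
                return arr;
            default:
                return type;
        }
    }
}
```

In this toy model, a class with an array-of-placeholder field that targets the class itself ends up with an array whose element type is the class instance. Whether the real generator can be patched the same way depends on whether its types are mutable after creation, which is not confirmed here.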

This is how it looks currently:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using Newtonsoft.Json.Schema;
using JSONSCHEMA = NJsonSchema.JsonSchema;

namespace SharpGLTF.SchemaReflection
{
    public class SchemaTypePlaceholder : SchemaType
    {
        public override string PersistentName => "Placeholder";

        public string PlaceholderTarget;
        public SchemaTypePlaceholder(Context ctx) : base(ctx)
        {
        }
    }

    static class SchemaTypesReader
    {
        public static SchemaType.Context Generate(NJsonSchema.CodeGeneration.CSharp.CSharpTypeResolver types)
        {
            var context = new SchemaType.Context();
            var schemasProcessed = new Dictionary<JSONSCHEMA, SchemaType>();

            foreach (var t in types.Types.Keys)
            {
                Console.WriteLine(t.DocumentPath);
                context._UseType(t, schemasProcessed, new HashSet<JSONSCHEMA>());
            }

            return context;
        }

        private static SchemaType _UseType(this SchemaType.Context ctx, JSONSCHEMA schema, Dictionary<JSONSCHEMA, SchemaType> schemasProcessed, HashSet<JSONSCHEMA> schemaStack, bool isRequired = true)
        {
            if (ctx == null) throw new ArgumentNullException(nameof(ctx));
            if (schema == null) throw new ArgumentNullException(nameof(schema));

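            // Already processed, or currently being processed (placeholder): reuse the cached entry.
            if (schemasProcessed.TryGetValue(schema, out var existingType))
            {
                return existingType;
            }

            // NOTE: a schema is added to schemaStack and schemasProcessed together (see below),
            // so the lookup above already catches every reentrant call and this branch is never taken.
            if (schemaStack.Contains(schema))
            {
                if (schemasProcessed.ContainsKey(schema))
                {
                    return schemasProcessed[schema];
                }
                else
                {
                    throw new InvalidOperationException("Recursive schema reference detected.");
                }
            }

            // Register a placeholder before descending so that a self reference terminates.
            // Any type built during the recursion keeps pointing at this placeholder,
            // even after the cache entry is overwritten with the final result.
            schemaStack.Add(schema);
            var placeholder = new SchemaTypePlaceholder(null);
            placeholder.PlaceholderTarget = schema.DocumentPath;
            schemasProcessed[schema] = placeholder;
            SchemaType result = null;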

            try
            {
                if (schema is NJsonSchema.JsonSchemaProperty prop)
                {
                    isRequired &= prop.IsRequired;
                }

                if (_IsStringType(schema))
                {
                    result = ctx.UseString();
                }
                else if (_IsBlittableType(schema))
                {
                    bool isNullable = !isRequired;

                    if (schema.Type == NJsonSchema.JsonObjectType.Integer) result = ctx.UseBlittable(typeof(Int32).GetTypeInfo(), isNullable);
                    else if (schema.Type == NJsonSchema.JsonObjectType.Number) result = ctx.UseBlittable(typeof(Double).GetTypeInfo(), isNullable);
                    else if (schema.Type == NJsonSchema.JsonObjectType.Boolean) result = ctx.UseBlittable(typeof(Boolean).GetTypeInfo(), isNullable);
                    else throw new NotImplementedException();
                }
                else if (schema.HasReference)
                {
                    result = ctx._UseType(schema.ActualTypeSchema, schemasProcessed, schemaStack, isRequired);
                }
                else if (schema.IsArray)
                {
                    var elementType = ctx._UseType(schema.Item.ActualSchema, schemasProcessed, schemaStack);
                    result = ctx.UseArray(elementType);
                }
                else if (_IsEnumeration(schema))
                {
                    if (schema is NJsonSchema.JsonSchemaProperty property)
                    {
                        bool isNullable = !isRequired;

                        var dict = new Dictionary<string, Int64>();

                        foreach (var v in property.AnyOf)
                        {
                            var key = v.Description;
                            var val = v.Enumeration?.FirstOrDefault();
                            var ext = v.ExtensionData?.FirstOrDefault() ?? default;

                            if (val is String txt)
                            {
                                System.Diagnostics.Debug.Assert(v.Type == NJsonSchema.JsonObjectType.None);

                                key = txt; val = (Int64)0;
                            }

                            if (v.Type == NJsonSchema.JsonObjectType.None && ext.Key == "const")
                            {
                                key = (string)ext.Value; val = (Int64)0;
                            }

                            if (v.Type == NJsonSchema.JsonObjectType.Integer && ext.Key == "const")
                            {
                                val = (Int64)ext.Value;
                            }

                            System.Diagnostics.Debug.Assert(key != null || dict.Count > 0);

                            if (string.IsNullOrWhiteSpace(key)) continue;

                            dict[key] = (Int64)val;
                        }

                        var name = string.Join("-", dict.Keys.OrderBy(item => item));

                        var etype = ctx.UseEnum(name, isNullable);

                        etype.Description = schema.Description;

                        foreach (var kvp in dict) etype.SetValue(kvp.Key, (int)kvp.Value);

                        if (dict.Values.Distinct().Count() > 1) etype.UseIntegers = true;

                        result = etype;
                    }
                    else
                    {
                        throw new NotImplementedException();
                    }
                }
                else if (_IsDictionary(schema))
                {
                    var key = ctx.UseString();
                    var val = ctx._UseType(_GetDictionaryValue(schema), schemasProcessed, schemaStack);

                    result = ctx.UseDictionary(key, val);
                }
                else if (_IsClass(schema))
                {
                    var classDecl = ctx.UseClass(schema.Title);

                    classDecl.Description = schema.Description;

                    if (schema.InheritedSchema != null)
                    {
                        classDecl.BaseClass = ctx._UseType(schema.InheritedSchema, schemasProcessed, schemaStack) as ClassType;
                    }

                    var keys = _GetProperyNames(schema);
                    if (schema.InheritedSchema != null)
                    {
                        var baseKeys = _GetInheritedPropertyNames(schema).ToArray();
                        keys = keys.Except(baseKeys).ToArray();
                    }

                    var props = keys.Select(key => schema.Properties.Values.FirstOrDefault(item => item.Name == key));

                    var required = schema.RequiredProperties;

                    foreach (var p in props)
                    {
                        var field = classDecl.UseField(p.Name);

                        field.Description = p.Description;

                        field.FieldType = ctx._UseType(p, schemasProcessed, schemaStack, required.Contains(p.Name));

                        field.ExclusiveMinimumValue = p.ExclusiveMinimum ?? (p.IsExclusiveMinimum ? p.Minimum : null);
                        field.InclusiveMinimumValue = p.IsExclusiveMinimum ? null : p.Minimum;
                        field.DefaultValue = p.Default;
                        field.InclusiveMaximumValue = p.IsExclusiveMaximum ? null : p.Maximum;
                        field.ExclusiveMaximumValue = p.ExclusiveMaximum ?? (p.IsExclusiveMaximum ? p.Maximum : null);

                        field.MinItems = p.MinItems;
                        field.MaxItems = p.MaxItems;
                    }

                    result = classDecl;
                }
                else if (schema.Type == NJsonSchema.JsonObjectType.Object)
                {
                    result = ctx.UseAnyType();
                }
                else if (schema.Type == NJsonSchema.JsonObjectType.None)
                {
                    result = ctx.UseAnyType();
                }
                else
                {
                    throw new NotImplementedException();
                }

                schemasProcessed[schema] = result;
                schemaStack.Remove(schema);

                return result;
            }
            catch
            {
                schemasProcessed.Remove(schema);
                schemaStack.Remove(schema);
                throw;
            }
        }

        private static bool _IsBlittableType(JSONSCHEMA schema)
        {
            if (schema == null) return false;
            if (schema.Type == NJsonSchema.JsonObjectType.Boolean) return true;
            if (schema.Type == NJsonSchema.JsonObjectType.Number) return true;
            if (schema.Type == NJsonSchema.JsonObjectType.Integer) return true;

            return false;
        }

        private static bool _IsStringType(JSONSCHEMA schema)
        {
            return schema.Type == NJsonSchema.JsonObjectType.String;
        }

        private static bool _IsEnumeration(JSONSCHEMA schema)
        {
            if (schema.Type != NJsonSchema.JsonObjectType.None) return false;

            if (schema.IsArray || schema.IsDictionary) return false;

            if (schema.AnyOf.Count == 0) return false;

            return true;
        }

        private static bool _IsDictionary(JSONSCHEMA schema)
        {
            if (schema.AdditionalPropertiesSchema != null) return true;
            if (schema.AllowAdditionalProperties == false && schema.PatternProperties.Any()) return true;

            return false;
        }

        private static JSONSCHEMA _GetDictionaryValue(JSONSCHEMA schema)
        {
            if (schema.AdditionalPropertiesSchema != null)
            {
                return schema.AdditionalPropertiesSchema;
            }

            if (schema.AllowAdditionalProperties == false && schema.PatternProperties.Any())
            {
                var valueTypes = schema.PatternProperties.Values.ToArray();

                if (valueTypes.Length == 1) return valueTypes.First();
            }

            throw new NotImplementedException();
        }

        private static bool _IsClass(JSONSCHEMA schema)
        {
            if (schema.Type != NJsonSchema.JsonObjectType.Object) return false;

            return !string.IsNullOrWhiteSpace(schema.Title);
        }

        private static string[] _GetProperyNames(JSONSCHEMA schema)
        {
            return schema
                    .Properties
                    .Values
                    .Select(item => item.Name)
                    .ToArray();
        }

        private static string[] _GetInheritedPropertyNames(JSONSCHEMA schema)
        {
            if (schema?.InheritedSchema == null) return Enumerable.Empty<string>().ToArray();

            return _GetInheritedPropertyNames(schema.InheritedSchema)
                .Concat(_GetProperyNames(schema.InheritedSchema))
                .ToArray();
        }
    }
}

vpenades commented 3 weeks ago

I've found this: https://stackoverflow.com/questions/35250621/recursive-self-referencing-json-schema

Not sure if it's relevant to the issue.