corvus-dotnet / Corvus.JsonSchema

Support for Json Schema validation and entity generation
Apache License 2.0
99 stars 9 forks source link

Code generator does not emit a valid field name for an empty strings in enum class values #370

Closed manekovskiy closed 3 months ago

manekovskiy commented 3 months ago

Example of schema that limits allowed value to "" (empty string) for the foo field value:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "foo": {
      "type": "string",
      "enum": [
        "",
        "bar"
      ]
    }
  }
}

Command to generate schema models: generatejsonschematypes "empty-string-value-enum-schema.json" --outputPath "." --rootNamespace "TestSchema" --outputRootTypeName "IssueDemonstration"

A field name is missing for the empty string value in the IssueDemonstration.FooEntity.EnumValues class:


public readonly partial struct IssueDemonstration
{
    /// <summary>
    /// Generated from JSON Schema.
    /// </summary>
    public readonly partial struct FooEntity
    {
    ...
        /// <summary>
        /// Permitted values.
        /// </summary>
        public static class EnumValues
        {
        ...
            /// <summary>
            /// Gets "" as a JSON value.
            /// </summary>
            public static readonly FooEntity =  FooEntity.Parse( "\"\"" ) ; 
                                         /* ^ Field name is missing */

A potential fix would be to emit a well-known name for an empty string value, something like EmptyString or "_". The problem with this approach is that we might encounter a case where enum is defined as ["", "EmptyString"] which will cause a naming conflict.

I understand that it is very unlikely to devise a good naming strategy for all non-digit or non-letter values. However, perhaps there is room for special treatment of empty string enum values?


A workaround is to use "anyOf" with a "maxLength" constraint for the "empty string" value:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "foo": {
      "anyOf": [
        {
          "type": "string",
          "maxLength": 0
        },
        {
          "type": "string",
          "enum": [
            "bar"
          ]
        }
      ]
    }
  }
}

The issue with this workaround is that it produces two times more files and makes it somewhat less intuitive to work with the model because now one needs to use an empty string literals/constants side-by-side with generated enum class values.

mwadams commented 3 months ago

Good catch! I'll get a fix for this out today, with the usual de-duplication logic for name collisions.

mwadams commented 3 months ago

One appealing (and frequently recommended) way of creating string enums is not to use enum at all, but to use

{ 
  "anyOf": [
     { "$ref": "#/$defs/Value1" },
     { "$ref": "#/$defs/Value2" },
     { "$ref": "#/$defs/Value3" }
  ],
  "$defs": {
    "Value1": {
      "title": "Documentation title Value1",
      "description": "Longform documentation for the enum value",
      "type": "string",
     "const": ""
  },
  "Value2": {
      "title": "Documentation title Value2",
      "description": "Longform documentation for the enum value",
      "type": "string",
     "const": "Foo"
  },
  "Value3": {
      "title": "Documentation title Value3",
      "description": "Longform documentation for the enum value",
      "type": "string",
     "const": "Bar"
  }
}

This is the best way to also be able to document the enum values, and you still get a single Match() pattern-matching function for consumption.

As you say, the structure is more verbose if you need to explicitly access the const values: (Value1.ConstInstance, Value2.ConstInstance) , but it is much better for documentation.

I talk a bit about this in this article https://endjin.com/blog/2024/05/json-schema-patterns-dotnet-numeric-enumerations-and-pattern-matching

manekovskiy commented 3 months ago

Pleasantly surprised by such a quick turnaround. This fix significantly simplifies my work, as I won't have to go through nine circles of hell rewriting an existing schema that people already got used to. Thank you!

mwadams commented 3 months ago

Thanks for a really clear bug report!

It is churning through the release process right now. Should be up on nuget in about 15 minutes.